Evaluation in Modern Education
WRIGHTSTONE · JUSTMAN · ROBBINS



J. WAYNE WRIGHTSTONE · Director,
Bureau of Educational Research,
Board of Education of the City of
New York


JOSEPH JUSTMAN · Bureau of Educational
Research, Board of Education
of the City of New York


IRVING ROBBINS · Department of
Education, Queens College




AMERICAN BOOK COMPANY · New York




Copyright © 1956 by AMERICAN BOOK COMPANY 


New York Cincinnati Chicago Atlanta Dallas San Francisco


All rights reserved. No part of this book protected by the above
copyright may be reprinted in any form without written permission
of the publisher.


Wrightstone, Justman, and Robbins: Evaluation in Modern Education 


MADE IN U.S.A. E.P, 2 




Preface 


In the modern school, increasing emphasis on the personal and social
development of the child, as well as on his academic achievement, has
called for the corresponding development of a variety of techniques
for evaluation. No longer do tests of intelligence and subject matter
achievement alone meet the needs for appraisal of the aims of a
comprehensive educational program. Such newer techniques as anecdotal
records, observational methods, questionnaires, inventories,
interviews, checklists, rating scales, personal reports, projective
methods, sociometric methods, case studies, and cumulative records
are required to assess such objectives as knowledge and
understandings, skills, interests, aptitudes, attitudes,
personal-social adjustment, critical thinking, and health and physical
development. In addition, techniques are needed to evaluate such
correlative factors as the social and economic backgrounds of pupils
and the educational climate in which classroom and school activities
are conducted.


DESIGN AND SCOPE OF THIS BOOK 

This book has been designed and written to acquaint the reader 
with (a) the modern concepts of evaluation and their historical de- 
velopment; (b) the major methods and techniques of evaluation that 
are used to assess the growth, development, and status of pupils and 
the nature of situations in modern elementary and secondary schools; 
and (c) the specific tests and measures that may be used to appraise 
the major objectives of modern education. 


The authors have drawn freely upon their practical experience in
educational evaluation in one of the largest public-school systems in
the United States, and have tested the content and approach used
in the text in graduate courses in various colleges and universities.
A conscious effort has been made to present a more comprehensive 
view of major evaluation techniques and methods of evaluating major 
objectives than that found in conventional textbooks on tests and 
measurements. The design of this book therefore differs in many re- 
spects from the conventional text. 

Part One consists of a discussion of the nature and scope of evalua- 
tion. It presents a brief overview of the origins and trends in measure- 
ment and evaluation. It discusses the principles, scope, and methodol- 
ogy of evaluation for present-day education. It defines briefly the types, 
uses, and qualities by means of which modern evaluation techniques 
are judged in determining their functions and contributions to mod- 
ern education. 

Part Two presents an informative and relatively nontechnical de- 
scription and discussion of an inclusive range of evaluation techniques 
as follows: short answer tests; oral and essay examinations; anecdotal 
records and observational methods; questionnaires, inventories, and 
interviews; checklists and rating scales; personal reports and pro- 
jective techniques; sociometric methods; case studies; and cumulative 
records of individual growth and progress. 

Part Three presents a review of the application of varied evaluation 
methods and techniques to assess the major objectives of modern 
education. These include: tests of achievement for the language arts, 
mathematics, and other selected courses; inventories of academic, 
personal, and vocational interests; tests of general and special apti- 
tudes; methods of assessing personal-social adjustment; scales of atti- 
tudes and values; measures of critical thinking; judgments of health 
and physical development; estimates of social and economic background;
and methods of evaluating classroom practices or climate and
school activities. Representative tests, scales, inventories, or other
techniques have been cited for each major objective. 


AUDIENCE FOR THIS BOOK 


Based upon the past experience of the authors, this book should 
appeal to the following: (a) graduate and undergraduate students in 
college or university classes; (b) guidance counselors, psychologists, 
and research personnel who face day-by-day problems in evaluating 
all aspects of growth and development of pupils; (c) school adminis- 
trators and supervisors to whom this book is a source of suggestions 
for solving problems of evaluation and guidance; and (d)
teachers-in-service who wish to have a handbook which they may consult for
aid in evaluating the major aspects of growth and development of
their pupils. To all of these types of personnel, this book is designed 
to provide practical help in solving their varied problems on the 
assessment of the status and development of pupils, teachers, and 
school practices. So vast and specialized is present-day information 
and research that entire textbooks are devoted to such specific topics 
as evaluation of (a) aptitudes, (b) interests, (c) personal-social ad- 
justment, (d) health and physical development, (e) attitudes and 
values, and (f) classroom and school activities and programs. This 
book is more valuable for the general practitioner than for a specialist 
in a particular objective of educational practice. 


USES OF THIS BOOK IN COLLEGE AND SCHOOL SITUATIONS 


It has been indicated that this textbook is designed for use by edu- 
cational personnel in college and school situations. The specific uses 
will be related to the problems of the users. In colleges the use may 
be more sequential and systematic than in a particular school situa- 
tion where a specific answer to a specific problem is sought. 

Even in college, however, the textbook is designed for flexible use. 
There is no magic formula in a chapter-after-chapter approach. The 
sequence of chapters represents the judgment of the authors on their 
preferred organization and presentation of the various topics. Flexible 
assignments of chapters may be used with equal effectiveness in a 
college course. 


For the guidance counselor, psychologist, research personnel, school
supervisor, or teacher in a practical school situation, the use of this
book may be even more flexible. At one extreme, it may be used as a
handbook to consult on a specific problem, e.g., an aptitude test for
music or a test of foreign language achievement. On the other hand,
it may be used and read to provide a broad overview of the newer
concepts, techniques, and objectives for evaluation in the modern
school.

Bibliographic references have been grouped at the end of each 
chapter. References in the text are identified by numbers set in 
parentheses which correspond to the proper source in the list at the 
end of the chapter. 


Contents


PART ONE—NATURE AND SCOPE OF EVALUATION

1 Origins and Trends in Measurement and Evaluation
2 Principles, Scope, and Methodology of Evaluation
3 Types, Uses, and Qualities of Major Evaluation Techniques
4 Administrative Aspects of an Evaluation Program


PART TWO—MAJOR EVALUATION TECHNIQUES

5 Short-Answer Tests
6 Essay and Oral Examinations
7 Observation and Anecdotal Records
8 Questionnaires, Inventories, and Interviews
9 Checklists and Rating Scales
10 Personal Reports and Projective Techniques
11 Sociometric Methods
12 Case Studies
13 Cumulative Records


PART THREE—EVALUATING MAJOR OBJECTIVES AND SITUATIONS

14 Evaluating Achievement in Language Arts and Mathematics
15 Evaluating Achievement in Selected Courses
16 Evaluating Interests
17 Evaluating Aptitudes
18 Evaluating Personal-Social Adjustment
19 Evaluating Attitudes and Values
20 Evaluating Thinking and Problem-Solving
21 Evaluating Health and Physical Development
22 Evaluating Socio-Economic Status
23 Evaluating School and Teaching Practices

APPENDIX—BASIC STATISTICAL CONCEPTS
GLOSSARY
DIRECTORY OF PUBLISHERS OF TESTS
INDEX OF AUTHORS
INDEX OF SUBJECTS


Tables


California School Systems Reporting the Use of Various Evaluative Devices
Illustrative Personality Tests and Inventories
Illustrative Achievement Test Batteries or Series
Illustrative Reading Readiness Tests
Illustrative Reading Tests
Illustrative Tests of Work-Study Skills
Illustrative Literature Tests
Illustrative Tests of Expression (Grammar, Usage, Spelling)
Illustrative Handwriting Scales
Illustrative Mathematics Tests
Illustrative Social Studies Tests
Illustrative Tests in the Natural Sciences
Illustrative Tests in Music and Art
Illustrative Foreign Language Tests
Illustrative Industrial Arts Tests
Illustrative Home Economics Tests
Illustrative Business Skills Tests
Illustrative School and Vocational Interest Inventories
Illustrative Intelligence Tests
Illustrative Subject-Matter Aptitude Tests
Illustrative Mechanical Aptitude Tests
Illustrative Art and Music Aptitude Tests
Illustrative Rating Scales of Behavior and Personality
Illustrative Tests in Health Areas
Illustrative Checklists on Health Conditions and Procedures
Social Classes in a Community


APPENDIX—BASIC STATISTICAL CONCEPTS

1 Frequency Distribution of Scores on a Reading Test
2 Computation of the Median and Quartiles
3 Computation of the Mean and Standard Deviation
4 Computation of the Product-Moment Correlation


Charts


Chart Illustrating Classification of Objective Tests According to Psychological Characteristics
Chart Illustrating Classification of Tests by Technical Features


Figures


1 Analysis of Errors
2 Sample Anecdotal Record Form
3 Ink Blot Similar to Those Used in the Rorschach Test
4 A Picture from the Thematic Apperception Test
5 A Sociogram
6 Personal Information and Subject Growth Sections of a Cumulative Record
7 Personality Scale and Guidance Data; Interview Data Sections of a Cumulative Record
8 Individual and Group Test Results of a Cumulative Record
9 A Handwriting Scale


APPENDIX—BASIC STATISTICAL CONCEPTS

1 A Frequency Histogram


EVALUATION IN MODERN EDUCATION


PART ONE

Nature and Scope of Evaluation


CHAPTER ONE

Origins and Trends in Measurement and Evaluation


The modern concept of evaluation, which has evolved
largely, though gradually, in recent decades, stemmed from a newer
philosophy of education which called for the development of more 
adequate techniques of assessing pupil growth and development. This 
recent philosophy of education has emphasized the responsibility of 
the educator not only for the development of concepts, information, 
skills, and habits, but also for the stimulation of pupil growth in atti- 
tudes, appreciations, interests, powers of thinking, and personal-social 
adaptability. As these objectives have become clarified and defined 
in instructional practices, appropriate methods of assessment—both 
formal and informal—have been devised to gauge the adequacy of 
the school's programs. For example, evaluative tests have been
devised to test such work-study skills as map reading; finding topics
in reference books; using indexes or tables of contents; and reading 
charts, graphs, or tables. 

Modern evaluation differs from older forms of appraisal in several 
ways. First, it attempts to measure a comprehensive range of
objectives of the modern school curriculum rather than subject-matter
achievement only. Second, it uses a variety of techniques of appraisal,
such as achievement, attitude, personality, and character tests.
Included also are rating scales, questionnaires, judgment scales of
products, interviews, controlled-observation techniques, sociometric
techniques, and anecdotal records. Third, modern evaluation includes
integrating and interpreting these various indices of behavior into an
inclusive portrait of an individual or an educational situation.

DIFFERENCES BETWEEN MEASUREMENT AND EVALUATION 


Monroe (8) has distinguished between measurement and evalua- 
tion by indicating that in measurement the emphasis is upon single 
aspects of subject-matter achievement or specific skills and abilities, 
whereas in evaluation the emphasis is upon broad personality changes 
and major objectives of the educational program. In function, evalua- 
tion involves (1) the identification and formulation of a comprehen- 
sive range of major objectives for a curriculum, (2) their definition in 
terms of pupil behavior to be realized, and (3) the selection or con- 
struction of valid, reliable, and practical instruments for appraising 
major objectives of the educative process or characteristics of per- 
sonal growth and development. On the other hand, measurement, 
through the use of achievement tests, yields measures of pupil at- 
tainment in subject-matter areas, especially in acquisition of skills and 
information. Although earlier test technicians recognized both the 
ability to apply information and skills in problem-solving and related 
factors, such as interests, attitudes, and personal-social adjustment, 
as learning outcomes, these technicians showed little concern about 
the measurement of the less tangible aspects of educative growth. For 
this reason, the term "measurement" as used by them in connection
with achievement tests meant the measurement of skills, factual
information, and the like.


The Development of Modern Evaluation 


The origins of modern evaluation go back many decades. In the
United States the concept of evaluation as related to measurement
extends from rudimentary approaches beginning before 1900 to the
comprehensive programs of the present day.

Scates (12) has traced the trends in measurement and evaluation
through five decades. The incubation period, extending from 1897 to
1906, began with the proposals of Rice and culminated in the
publication of the first Binet Intelligence Scale. The second period, from
1907 to 1916, was characterized by the work of Thorndike and his
students, the publication of the first standardized scales and
achievement tests, and the fight for objectivity in testing educational
achievement. The third period, from 1917 to 1926, was one of rapid
expansion for educational measurements; during this time the exigencies
of World War I contributed directly to the introduction of group
intelligence tests and stimulated their postwar development. The fourth
period, from 1927 to 1936, was characterized by direct attention to
the objectives of instruction, to the evaluation of instructional
outcomes of all aspects of growth, and by the construction of many new
and improved tests and techniques for personality assessment as well
as for intelligence and achievement measurement. The period from
1937 to 1946 was marked primarily by the emphasis on projective
personality techniques, differential aptitude tests, and application of
techniques of factor analysis to a wide variety of test batteries. These
trends are described more fully in subsequent paragraphs.


THE FIRST DECADE—RISE OF TESTING (1900-1910) 


In 1897, Joseph M. Rice (11) had published his studies entitled
"The Futility of the Spelling Grind." In his investigation, Rice set up
the design of an experiment for comparing the achievement of pupils 
who spent a considerable amount of time on spelling drill with that 
of others who spent less time. To obtain a measure of achievement, 
he devised a spelling test which may be considered the first modern, 
formal type of test. Rice was interested not only in the measurement 
of spelling as such but also in evaluating the strengths and weaknesses 
of the educational practices of his time. This interest represented a 
break with the idea, generally prevalent prior to 1900, that learning 
products were intangible and could be appraised only by the teacher 
of a particular class. 

During the first decade of this century, some experimentation was 
initiated with the Binet scale for evaluating intelligence, and some 
beginnings of achievement tests in the basic skills of arithmetic and 
the language arts can be discerned. Stone in 1908 and Courtis in 
1909 published tests to measure achievement in elementary arith- 
metic. One can trace, in the decades which follow the initial formal 
test construction, the rise of modern measurement and evaluation. 
Ayres (1) and Freeman (5) have described in detail these early
stages in educational measurement.


THE SECOND DECADE—DEVELOPMENT OF TESTING (1910-1920) 


The second decade of the century was marked by the development 
of some of the early standardized tests of intelligence and achieve- 
ment and the struggle to get them accepted by both educators and the 
public. An important milestone in the use of achievement tests was 
reached when standardized tests were employed on a large scale for 
the first time in a survey conducted by the City of New York in
1911-13. After this, standardized tests were used in other large cities.

During this period many scales and tests were devised. Thorn- 
dike's (15) handwriting scale appeared in 1910. This scale included 
a series of samples of handwriting presented on a chart in order of 
merit. Based on the judgment of a jury of experts, each sample was
assigned a statistically determined numerical value or score. Thus a 
teacher could compare a sample of a pupil’s writing with the scale 
samples and assign to it the numerical score of the scale sample that 
it most closely resembled. This may appropriately be called the first 
scientifically constructed test or measure of educational achievement. 

In 1911, Ayres published a scale in which the samples of handwrit- 
ing were arranged in order of increasing legibility, but the scores for 
samples were established by less rigorous statistical methods than 
those of the Thorndike scale. The Hillegas Composition Scale,
published in 1912, followed the Thorndike quality-scale pattern. In 1913,
Buckingham published a spelling scale in which the words were
arranged in order of increasing difficulty. The criterion of difficulty was
used by Woody in building an arithmetic scale and by Trabue in con- 
structing a language scale, both published in 1916. 

Tests published before 1916 also included a series by Starch which 
embraced his Reading Tests, Grammatical Scales, Punctuation Scale, 
Grammar Tests, and Latin Vocabulary and Reading Tests. About the 
same time, Thorndike constructed a Scale for the Merit of Drawing 
by Pupils Eight to Fifteen Years Old. Although Goddard had trans- 
lated Binet's Intelligence Scale as early as 1908, Terman did not make 
his famous Stanford Revision until 1916. 


THE THIRD DECADE—EXTENSION OF STANDARDIZED TESTING 
(1920-1930) 

In the third decade, tests both of intelligence and of achievement 
advanced rapidly. The use of intelligence tests during World War I 
for classifying personnel in the armed forces provided an opportunity 
for the employment of new tests and gave impetus to the testing 
movement. In 1919, achievement-test batteries and intelligence tests 
were issued for the first time by commercial publishers for purchase 
by school systems. Notable among these were such pioneer tests as the 
Otis Intelligence Test and the Stanford Achievement Test batteries. 
Before 1930, more than one thousand standardized tests had appeared.
During this period the development of statistical techniques of test
analysis also received considerable attention.


THE FOURTH DECADE—RISE OF EVALUATION (1930-1940) 


During the 1930’s, measurement and the evaluation movement
continued to expand. It was during this decade that the Cooperative
Test Service began to publish a large number of parallel forms of
tests for secondary schools. Many new batteries of tests appeared.
During this period, also, appeared such personality tests as the
Rorschach and other projective techniques. Interest inventories were
devised. Attitude scales and sociometric techniques were developed. 
Anecdotal records were introduced as evaluation techniques. All of 
these were evidence of a serious attempt to find ways to test not only 
tangible skills and knowledges, but also the less tangible objectives 
of the modern educational program. 

It was during this period also that Tyler, Wrightstone, and others 
carried on studies evaluating newer practices in elementary and sec- 
ondary schools, while Eurich conducted several studies at the college 
level to appraise the effectiveness of college curricula and to guide 
the revision of such curricula to meet more adequately the needs of 
college students. At the same time, evaluative criteria which recog- 
nized the comprehensive and flexible nature of the schools’ curricula 
were set up for accrediting high schools and colleges. 


THE FIFTH DECADE—EXTENSION OF MEASUREMENT 
AND EVALUATION (1940-1950) 

The decade of the 1940’s was marked by a maturing and a refine- 
ment of the techniques developed during the 1930's. Various evalua- 
tion studies will indicate the scope and nature of these evaluation 
programs. In New York City the appraisal of the activity program in 
the elementary schools (9) involved a comprehensive design of evalu- 
ation. In the secondary schools the “Eight-Year Study” of thirty high 
schools (14) illustrates the broad scope of a modern evaluation pro- 
gram. The Cooperative Study of Evaluation in General Education 
of the American Council on Education for colleges used similar
techniques. The Commission on Teacher Education also formulated a
broad program of evaluation.


Scope of Modern Evaluation—Illustrative Studies


DESIGNS FOR SECONDARY AND ELEMENTARY SCHOOLS 

Most comprehensive in design of the plans for evaluating sec- 
ondary and elementary schools is the Eight-Year Study directed by 
Tyler. The report on the appraisal program (14) provides a detailed 
description of the development and application of present-day
evaluation instruments. These instruments were designed to measure the
major objectives of high-school instruction, as follows:

1. For aspects of thinking: tests of interpretation of data,
application of principles, logical reasoning, and nature of
proof.

2. For social sensitivity: tests of application to social
problems of social values, social facts, and generalizations.

3. For civic and social beliefs: scales of social, political, and
economic beliefs.

4. For aspects of appreciation in literature and art: a variety
of techniques.

5. For interests: an inventory of personal, social, and school
interests.

6. For personal and social development: various self-reporting
scales and anecdotal records.


In addition, various pupil-record forms for noting reading and listen- 
ing were used. These tests, scales, inventories, questionnaires, check- 
lists, pupil logs, and other records were applied as required in gather- 
ing evidence about the achievement of objectives in the curriculum in 
each of the thirty schools participating in the study.

In connection with this study, Raths (10) has described and illus- 
trated how the newer concepts of evaluation are applied in evaluating 
the program of a progressive high school. He shows how various 
tests and instruments of appraisal for such objectives as interpreting 
data, applying scientific principles, and evaluating data may be used 
to analyze both individual pupil behavior and class performance. He
illustrates, by case studies, the integration of various indices of be- 
havior into an inclusive appraisal portrait of the individual student. 

On the elementary level, the a
New York City schools (9, 17) il
of objectives, regarding basic ski
metic, and selected work-
in Elementary Education (7). This committee asked recognized authorities
to submit statements concerning desirable outcomes in subject-matter
learning and intellectual competence. In addition, the
committee obtained statements of objectives from recognized authori- 
ties on personal development and social maturation. The committee 
used the contributions of all the specialists, as well as of general 
educators, in preparing a comprehensive statement of objectives for 
the modern elementary school. In summary form here are the nine 
goals for the elementary-school years which were formulated: 


a. Health, safety, and physical development

b. Social and emotional development

c. Ethical behavior, personal standards, and moral values

d. Ability to assume leadership, to choose leaders wisely, to
work in teams in neighborhoods, communities, and states

e. Becoming a citizen valuable in his home, school, and
community today, and a good neighbor, good community
member, and good American citizen in the world of
tomorrow

f. Knowledge of the physical world of plants and animals,
nature, science, conservation, machines

g. Esthetic development as both consumer and producer, as
enjoyer and creator—in art, music, literature, drama, radio
and television, crafts, home and community beautification

h. Competence in communicating with other people through
speaking, listening, reading, and writing

i. Ability to count, measure, compute, estimate, and reason
quantitatively

Each objective is defined in detail in the committee report. This
report points out the need to review and classify the kinds of
measurement and evaluation now available to appraise these objectives.
For some of the objectives, it is recommended that new instruments
of appraisal be constructed.


FOLLOW-UP STUDIES AS EVALUATION METHODS 

The follow-up study is used as another phase of a larger pattern of 
evaluation. Two illustrative studies of this type may be mentioned. 
Bell’s study (2) for the American Youth Commission was a pioneer 
attempt to determine and analyze the status of a representative sam- 
ple of Maryland youth, aged sixteen to twenty-four. Interviews were 
held with more than thirteen thousand young people, sampled on the 
basis of sex, race, school status, job status, social and economic status, 
marital status, type of community, and other factors. The study was 
concerned with youth at school, home, work, play, and church. 




The follow-up phase of the Eight-Year Study has been reported by 
Chamberlin and others (3). Graduates from thirty experimental high 
schools were matched with graduates of traditional high schools on 
the basis of such matters as scholastic aptitude, sex, race, age, reli- 
gious affiliation, size and type of high school, size and type and loca- 
tion of community, socio-economic status of family, and extracurricu- 
lar activities in high school. A comparison was made between the sub- 
sequent success of both groups in college. College records and reports, 
special questionnaires and tests, and personal interviews were used 
to gather evidence about nine aspects of college success: intellectual 
competence, cultural development, practical competence, philosophy 
of life, character traits, emotional balance, social fitness, sensitivity to 
social problems, and physical fitness. In many of these areas, the mass 
of data for each student was correlated to behavior levels or types 
which were described briefly. 


Recent Trends in Evaluation 


The modern teacher and the supervisor are concerned with impor- 
tant functional learning outcomes, many of which are less tangible 
and less easily measured than the concepts, skills, and abilities repre- 
sented in subject-matter tests of the past several decades. The con- 
cern for the total development of the child—physical, emotional, social, 
and intellectual-has resulted in an emphasis upon a sound under- 
standing of child growth and development and of individual and 
group differences, as well as upon the personal and social adjustment 
of the pupils. This represents an emphasis upon Gestalt, or organis- 


elationships of the multi-

Tests of general educational development usually present information
in verbal, graphic, or other form, with the test exercises devised
to measure the ability of the individual to comprehend and interpret 
the material presented. This contrasts with the isolated test item 
which emphasizes the recall or recognition of items of information. 
General educational development tests usually cover such areas as 
language arts, including literature, social studies, science, and mathe- 
matics. 

The increased use of informal, or teacher-made, test exercises to 
supplement formal or standardized tests is also characteristic of re- 
cent evaluative programs. Some surveys show that the average class- 
room teacher uses five or six teacher-made tests for every standard- 
ized test employed. Informal tests are valuable in the day-by-day ap- 
praisal and guidance of pupils in specific units of classroom study. 

The development of factor analysis of mental abilities, in which 
Dr. L. L. Thurstone has been a leader, is another important new ap- 
proach. By statistical analysis of many mental ability tests, Thurstone 
isolated seven primary mental abilities. These include: a verbal fac- 
tor, which involves vocabulary and reading comprehension; a number 
factor, which involves speed and accuracy in computation; a space 
factor, which involves visualization of space relationships; a rote
memory factor; a perceptual factor, which involves discrimination of
likenesses and differences; a word factor, which involves naming isolated
words at a rapid rate; and a reasoning factor, which involves finding a 
rule or principle governing a series of numbers, letters, or words. The 
mental ability tests incorporating the results of this analysis, called 
Tests of Primary Abilities, are published in batteries for various age 
levels by the Science Research Associates of Chicago. Constructed 
along somewhat similar lines are the Differential Aptitude Tests, pub- 
lished by the Psychological Corporation. The Differential Aptitude 
Tests measure similar abilities and are especially useful for guidance 
work and counseling of high-school students, for both educational and 
vocational purposes.

One approach to the measurement of reading achievement
analyzed the factors involved in reading comprehension, and served
as the basis for the development of the Cooperative Reading Comprehension
Test. In this analysis, Davis (4) identified two general factors
and six specific factors in reading comprehension. The general factors
are word meaning and reasoning in reading, which includes the ability
to weave together several ideas and to show their relationships, as
well as the ability to draw correct inferences.
The specific factors isolated are (a) ability to determine the writer's
purpose, intent, or point of view, (b) ability to understand the writer's
explicit statements or to get the literal meaning, (c) ability to follow
the organization of a passage, (d) ability to select the main thought
or idea of a paragraph or passage, (e) ability to determine from con- 
text the meaning of an unfamiliar word, and (f) ability to determine 
the tone and mood implicit or explicit in a passage. 

Another development in recent evaluation studies is the measurement of
the role of the individual, as well as of small groups, in studies of
group dynamics. This trend has shown itself in several ways. One is
the measurement of the social status of groups and individuals. 
Warner (16) and Hollingshead (6) are particularly noted for this 
type of work. A second manifestation is the work on group dynamics 
that has been carried on by various organizations, especially the Re- 
search Center for Group Dynamics at the University of Michigan. 
A third manifestation is the various sociometric analyses that have
been made in the studies of intergroup education. 

The increasing attention that has been given to the unstructured, 
projective tests of personality is also typical of recent approaches to 
evaluation. Techniques developed for evaluating personal-social be- 
havior range from the analysis of paintings and drawings, play tech- 
niques, and sentence completion to more formal methods such as the 
Rorschach and Thematic Apperception Tests. 


Summary 


This brief survey of the origins and development of measurement 
and evaluation provides an appreciation of five decades of effort and 
research. Testing with emphasis upon the measurement of single as- 
pects of subject-matter achievement or of specific skills and abilities 
was markedly different from the newer concept of evaluation, with 
its emphasis upon the appraisal of broad personality changes, includ- 
ing interests, attitudes, powers of thinking, and personal-social adapt- 
ability. 

For convenience in discussion, the development and refinement of 
methods of measurement and evaluation are classified into five periods 
or decades. During the first decade, the modern concepts and methods 
of testing had their origin in the contributions of Joseph M. Rice and 
his test of spelling, in experimentation o 


associates, who introduced concepts of 
scaling of items according to difficulty, and statistically
determined norms for achievement tests. During this period,
Terman made his famous revision of the Binet scale for measuring 
intelligence. The third decade was characterized by the rapid growth 
of educational measurements. There appeared group intelligence tests 
for children, patterned after the group intelligence tests used for 
adults in World War I. Achievement test batteries of basic academic 
skills were published for purchase by school systems. Statistical tech- 
niques of test analysis received considerable attention. During the 
fourth decade, attention was directed to the formulation of methods 
and techniques for assessing growth in such objectives as attitudes, 
interests, powers of thinking, and personal-social adaptability, as well 
as to the measures of intelligence and achievement. Personality tests, 
projective techniques, interest inventories, attitude scales, and anec- 
dotal records were introduced as instruments of evaluation in schools. 
During the fifth decade, a refinement of the techniques developed 
during the previous decade was apparent. The application, as well as 
the construction, of these newer techniques is illustrated in such stud- 
ies as the Eight-Year Study of secondary schools and the appraisal of 
the activity program in the New York City schools. 

In more recent years, the following trends in evaluation have be- 
come evident: (a) The modern teacher and the supervisor are con- 
cerned with important functional learning outcomes, many of them 
less tangible and less easily measured than the subject-matter con- 
cepts, skills, and abilities of previous decades. (b) An increasing em- 
phasis on the measurement of understanding and interpretation rather 
than upon isolated information, skills, and abilities is particularly observable
in present-day tests of general educational development. (c)
The increased use of informal, or teacher-made, test exercises for instructional
purposes to supplement formal or standardized tests is also
characteristic of recent evaluation programs. (d) The development of
factor analysis of mental abilities may also be cited as an important
new approach in measurement and evaluation. (e) Techniques have been
developed for measuring the role of the individual, as well as of
small groups, in studies of group relationships. (f) Increasing attention
has been given to the development and refinement of unstructured,
or projective, tests of personality.


Problems for Class Discussion 


1. In parallel columns, under the captions “Measurement” and “Evaluation,” 
list the major differences between these two concepts. 




2. On a graphic time line or chart from 1900 to 1950, indicate for each 
decade the major trends in the measurement and evaluation movement 
in the United States. 

3. Select two of the recent trends in evaluation. For each of these trends,
explain the educational or psychological factors contributing most directly 
to the trend. 


References Cited in This Chapter 


1. Ayres, Leonard P., “History and Present Status of Educational Measurements,”
The Measurement of Educational Products, Seventeenth Yearbook
of the National Society for the Study of Education, Part II,
Chapter I, pp. 9-15. Bloomington, Ill.: Public School Publishing Co.,
1918.

2. Bell, Howard M., Youth Tell Their Story. Washington, D. C.: American
Council on Education, 1938.

3. Chamberlin, Dean, and others, Did They Succeed in College? New
York: Harper & Brothers, 1942.

4. Davis, Frederick B., “What Do Reading Tests Really Measure?”, English
Journal, 33:180-187, April, 1944.

5. Freeman, Frank N., Mental Tests: Their History, Principles, and Applications
(Revised Edition). Boston: Houghton Mifflin Co., 1939.

6. Hollingshead, August de B., Elmtown’s Youth, The Impact of Social
Classes on Adolescents. New York: John Wiley and Sons, 1949.

7. Kearney, Nolan C., Elementary School Objectives. Report of the Mid-Century
Committee on Outcomes in Elementary Education. New York:
Russell Sage Foundation, 1953.

8. Monroe, W. S., “Educational Measurement in 1920 and in 1945,”
Journal of Educational Research, 38:334-340, January, 1945.

9. Morrison, J. Cayce, The Activity Program: A Curriculum Experiment.
New York: Board of Education of the City of New York, 1941.

10. Raths, Louis E., “Evaluating the Program of a School,” Educational
Research Bulletin, 17:57-84, March, 1938.

11. Rice, Joseph M., “The Futility of the Spelling Grind,” Forum, 23:
163-172, 409-419, April and June, 1897.

12. Scates, Douglas E., “Fifty Years of Objective Measurement and Research
in Education,” Journal of Educational Research, 41:241-264, December,
1947.

13. Smith, Dora V., “Recent Procedures in the Evaluation of Programs in
English,” Journal of Educational Research, 38:262-275, December,
1944.

14. Smith, Eugene R., Tyler, Ralph W., and others, Appraising and Recording
Student Progress. New York: Harper & Brothers, 1949.

15. Thorndike, E. L., “Handwriting,” Teachers College Record, 11:1-98,
March, 1910.

16. Warner, W. Lloyd, Meeker, Marcia, and Eells, Kenneth W., Social Class
in America: A Manual of Procedure for the Measurement of Social Status.
Chicago: Science Research Associates, 1949.

17. Wrightstone, J. Wayne, “Evaluation of the Experiment with the Activity
Program in the New York City Elementary Schools,” Journal of Educational
Research, 38:252-257, December, 1944.


References for Further Reading 


Ayres, Leonard P., “History and Present Status of Educational Measurements,”
The Measurement of Educational Products, Seventeenth Yearbook
of the National Society for the Study of Education, Part II, Chapter
I, pp. 9-15. Bloomington, Ill.: Public School Publishing Co., 1918.

This is one of the earliest and best statements of the early history of
educational and psychological measurements. It is a basic reference and
source of information on the origins of modern tests.

Freeman, Frank N., Mental Tests: Their History, Principles, and Applications
(Revised Edition). Boston: Houghton Mifflin Co., 1939.

As an intermediate source of information, this book provides an excellent
statement of the history and applications of tests in education. It
is an excellent source of information about trends in tests before 1940.

Scates, Douglas E., “Fifty Years of Objective Measurement and Research in
Education,” Journal of Educational Research, 41:241-264, December,
1947.

This reference provides the most comprehensive survey of the development
of educational and psychological measurement from its origins to
about 1950. It is well documented and is recommended for the student
who wishes to follow the trends in testing from its early beginnings to
the present time.


Principles, Scope, and 
Methodology of Evaluation 


CHAPTER TWO 


Evaluation has been practiced by teachers for many
years, but the purposes, functions, and techniques of evaluation have
gradually been modified and improved. Through research there have
been developed and refined new techniques and methods for the
measurement of attitudes, interests, critical thinking, and personal-social
adaptability.

Broadly defined, educational evaluation is the estimation of the
growth and progress of pupils toward objectives or values in the curriculum.
The purposes of evaluation are to provide for the collection
of evidence which will show the degree to which pupils are progressing
toward curricular goals, and to permit teachers and supervisors to
evaluate the effectiveness of curricular experiences, activities, and instructional
methods. The functions of evaluation are to make provisions
for guiding the growth of individual pupils, to diagnose their
weaknesses and strengths, to point out areas where remedial measures
may be desirable, and to provide a basis for the modification of
the curriculum or for the introduction of experiences to meet the
needs of individuals and groups of pupils.

Techniques of evaluation range from such informal measures as
teacher ratings on oral recitations and teacher-made tests to more refined
and standardized measures of aptitudes, abilities, skills, interests,
and attitudes. Formerly, the major emphasis in measurement was
on appraisal of pupil mastery of information and skills. The changing
concepts of the curriculum have required the evaluation of pupil
growth in other areas as well, such as physical and mental health,
social relationships, critical thinking, appreciations and creative expression,
interests, and attitudes. In other words, new curricular emphases
have required the development of new techniques of measurement
and evaluation.


Steps in the Process of Evaluation 


It is possible to indicate here only the broad steps that should be 
taken in working out an evaluation program fitted to the conditions 
and needs of a particular school. These steps are arranged roughly in 
the sequence which would be followed in planning and in implement- 
ing an evaluation program. They overlap at various places, and in 
some situations, of course, the sequence might need to be modified. 


1. Formulation of the Major Objectives of the Curriculum


Teachers and supervisors must first define the values, or objectives,
toward which they wish to guide pupil growth and development.
The philosophy of education in any school will be important in de- 
termining the values or objectives to be set up as goals for curriculum- 
making and, consequently, for evaluation in that school. Though the 
pattern of objectives will vary from school to school, a comprehensive 
range of major objectives would be concerned with physical and men- 
tal health, social relationships, skills and knowledges, appreciations 
and creative expression, critical thinking, interests, and attitudes. A
school which preferred a less comprehensive range of objectives might
choose to stress skills and knowledges, interests, and attitudes. In any
event, the major objectives operate directly or indirectly in all of the
subjects or areas of the curriculum. It is essential, therefore, that the
staff of a school agree as clearly as possible upon those objectives
which it feels are appropriate to its local needs. These objectives then
become the guideposts in both curriculum development and evaluation.
Techniques of evaluation will be selected or methods constructed
to measure pupil growth in each major objective.

Three methods have been used in identifying and defining curricular
objectives for purposes of evaluation. These are (a) curriculum analysis,
(b) conference, and (c) questionnaire and interview.

The curriculum analysis method may be illustrated by the studies
of Wrightstone (8, 9), Tyler (6, 7), and Smith (5). In these studies,
the general purpose of a curriculum is analyzed into relatively independent
objectives. The analysis continues to a point where the objectives
become clear and useful in teaching and in appraisal, and is
suspended after each objective has been defined explicitly in terms of
the student behavior leading to its attainment. Smith (5), in reviewing
recent evaluation studies, found nine recurring major objectives: mastery
of basic skills, effective ways of thinking, development of understandings
and insights as revealed in social behavior, gains in knowledge
as exhibited through conduct in personal and social situations,
development of interests as related to activities, fostering of personal
initiative, creative power, sincerity and potency of attitudes, and post-school
vocational competence. In the social studies, Wrightstone (10)
has defined such major objectives as acquiring functional information,
interpreting facts, applying generalizations, developing work-study
skills, organizing facts, and developing social attitudes.

The conference method was used in the Eight-Year Study of the 
Progressive Education Association. Interschool committees were set 
up to indicate those objectives for which instruments of evaluation 
were available. Many of the objectives proposed were so vague that 
their meanings had to be clarified through definition in terms of stu- 
dent behavior. 


Raths (4) reported for the Eight-Year Study these major objectives: 


(a) THINKING, which was defined broadly to include ability to
    interpret data, ability to apply principles of science, social
    facts, and generalizations to specific situations, ability to
    analyze the nature of proof in reasoning, and sensitivity to
    social problems and situations.

(b) INTERESTS, AIMS, and APPRECIATIONS, which were broadly
    defined to include a liking or preference for various activities,
    insight into an activity, realizing its true values, and distinguishing
    the better activities from the worse.

(c) ATTITUDES, which were defined broadly as beliefs or opinions
    about social, economic, and political issues, and about school
    life.

(d) STUDY SKILLS and WORK HABITS, such as effective use of study
    time, effective use of various sources of data, and effective
    recording, organization, and presentation of data.

(e) SOCIAL ADJUSTMENT, which involved such aspects of behavior
    as acceptance of one's own impulses, aggressions, human relationships,
    identifications with others, fantasies, and reactions
    to authority.

(f) CREATIVENESS, as characterized by original and personal expression
    in language arts, graphic arts, and industrial arts.

(g) FUNCTIONAL INFORMATION, as represented in the acquisition
    of concepts, understandings, and skills related to the various
    secondary-school subjects or curriculum areas.

(h) FUNCTIONAL SOCIAL PHILOSOPHY, which involves the integration
    of all aspects of growth and development into an intelligent
    and co-ordinated way of living in our modern society.


The questionnaire and interview method is best illustrated by the
University of Minnesota study (1), wherein a questionnaire was sent
to graduates of the college and an interview follow-up was made to
determine the validity of the method. From an analysis of the responses
to the questionnaire, the objectives of the college were determined.
In a similar study at Bennington College (2), the college
records and opinions of trustees, students, and faculty were used in
analyzing the objectives. These objectives, once they had been defined
through extensive deliberations and conferences, provided the
basis for a program of evaluation which used questionnaires, tests,
rating scales, records, and interviews.


2. Definition and Clarification of These Major Objectives 


It is not sufficient to make a list of major objectives; it is necessary 
to define these objectives so that their meaning will be clear. For ex- 
ample, it is essential to outline more or less specifically the skills and 
knowledges in reading, in mathematics, or in science. It is equally im- 
portant to define as clearly as possible the proper types of critical 
thinking and attitudes that should be developed in subjects such as 
social studies and science. This definition and clarification makes pos- 
sible specific application of the broad objectives to particular types of 
curriculum content and experiences, and thus constitutes a guide to
the methods of presentation of such experiences to the pupils. Following
are examples of such definitions for several objectives in
reading.


Vocabulary: Accurate word recognition and word meaning
in both recreational and work-type reading of an appropriate
level of difficulty.

Comprehension: Adequate ability to get literal meaning, to
follow the organization of a selection, to define new words in
context, to identify the main idea, to see relationships among
ideas, and to judge the tone of a passage.

Study Skills (Locating Information): Among the many skills,
some of those most commonly used in the average classroom
include: (a) using indexes, (b) using tables of contents, (c)
using dictionaries, (d) using card files, and (e) using such
reference sources as atlases, maps, encyclopedias, yearbooks
(such as The World Almanac), textbooks, magazines, and
newspapers.

Interests: Balanced reading diet of various types of reading
materials, from light and fantastic to serious and technical.

Attitudes: Assimilation and critical appraisal of attitudes
implicit or explicit in reading content.




3. Selection of Available Tests or Measures for Each Major Objective 

After the major values or objectives have been identified and clari- 
fied for each area of the curriculum, the next logical step is to select 
available tests or measures that will provide evidence about the 
growth and development of pupils toward each major objective. Thus, 
one must determine the applicability of various standardized or pub- 
lished tests and scales for measurement of achievement in the major 
objectives in each subject-matter area. In such subjects as reading, 
mathematics, social studies, science, and language arts, some, but not 
all, of the many standardized tests published will be found to fit the 
definition of the objectives as they have been developed by the staff 
of the school. 

For example, in the field of social studies there are available tests 
of social studies concepts by Kelty and Moore for elementary-school 
and by Wesley for secondary-school children. Study skills may be 
measured by the Iowa Every-Pupil Test of Work-Study Skills. 

Critical thinking may be measured at the secondary-school level by 
the Watson-Glaser Test of Critical Thinking, the Cooperative Test of 
Social Studies Abilities, or the battery of tests devised in the Eight- 
Year Study under the titles of Interpretation of Data, Applying Princi- 
ples and Generalizations, and Nature of Proof. Attitudes or beliefs 
may be measured by the series of attitude scales devised by H. H. 
Remmers and his associates, Beliefs on Social Issues devised by the
staff of the Eight-Year Study, or the Scale of Civic Beliefs by Wrightstone.


4. Construction of Needed Test Scales or Techniques 


For some of the objectives, standardized or published tests, scales, 
or other techniques will not be available. It is essential, therefore, to 
begin construction of necessary measurement techniques which will 
help to appraise growth and development in such objectives. For 
example, very few published tests are available to measure growth and 
development in appreciations or in creative expression, and the staff 
of a school might work cooperatively with test technicians in order to
develop the needed scales or techniques to permit measurement of
pupil progress in these areas.

In one large city, as an illustration, a committee of mathematics 
teachers cooperated with school psychologists and research personnel 
in the construction of tests covering arithmetic computation and arithmetic
judgment. Another committee of teachers cooperated in the
construction of a battery of tests to measure critical thinking in chemistry.
Several teachers from one high school cooperated in the construction
of scales of attitudes to measure the effectiveness of a curriculum
stressing intercultural relationships.

Another committee of teachers from several vocational high schools 
cooperated in the construction of a battery of tests for courses in each 
of the following areas of vocational education: automotive trades, 
machine shop trades, electrical trades, trade dressmaking, commercial 
education, and cosmetology. For each trade, the battery included tests 
for trade knowledge required in the particular industry, trade per- 
formance, related drawing, related mathematics, and related science. 


5. Application of the Various Formal and Informal Tests and Techniques
to the Appraisal of Individual Growth and Development

The final step in the process of evaluation is to apply the various
formal and informal tests and techniques in order to make judgments
about individual growth and development in each of the major objectives.
This application and the subsequent interpretation of the results
will permit teachers and supervisors to guide the growth and
development of each pupil to the best of his individual capacities,
abilities, and goals. It will also permit the teachers and supervisors
to judge the effectiveness of the curriculum and the instructional techniques
that are being used, and to make desirable modifications in
them. Of these two major purposes the more important is to provide
some data or evidence that will permit the teacher to guide the individual
more wisely than would be possible without such information
about his growth and development over a period of time.
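
The selection and construction steps (3 and 4) can be pictured as a simple lookup: each objective is matched against the instruments already available, and any objective left without an instrument is set aside for cooperative construction. The following is a minimal sketch in Python, not a procedure described in the studies cited. The function name `plan_instruments` and the catalogue `AVAILABLE` are hypothetical, and the pairing of objectives with instruments merely restates examples given in the text.

```python
from typing import Dict, List, Tuple

# Illustrative catalogue only: objectives paired with instruments named in the text.
AVAILABLE: Dict[str, List[str]] = {
    "study skills": ["Iowa Every-Pupil Test of Work-Study Skills"],
    "critical thinking": [
        "Watson-Glaser Test of Critical Thinking",
        "Cooperative Test of Social Studies Abilities",
    ],
    "attitudes": ["Scale of Civic Beliefs"],
}


def plan_instruments(
    objectives: List[str], available: Dict[str, List[str]]
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Split objectives into those with published measures (step 3)
    and those for which measures must be constructed (step 4)."""
    selected: Dict[str, List[str]] = {}
    to_construct: List[str] = []
    for objective in objectives:
        instruments = available.get(objective, [])
        if instruments:
            selected[objective] = list(instruments)
        else:
            to_construct.append(objective)
    return selected, to_construct


# A hypothetical school's list of major objectives.
selected, to_construct = plan_instruments(
    ["study skills", "critical thinking", "attitudes",
     "appreciations", "creative expression"],
    AVAILABLE,
)
```

Under these assumptions, appreciations and creative expression fall into `to_construct`, which agrees with the observation above that very few published tests exist for those objectives.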


Characteristics of an Adequate Program of 
Evaluation 


A modern program of evaluation can be recognized by certain 
characteristics. Four questions may be applied to judge the adequacy 
of a program of evaluation in a modern school: 


1. Is the Design of the Evaluation Program Comprehensive? 

The major objectives to be appraised should include not only concepts,
skills, and knowledges, but also appreciations, attitudes, interests,
critical thinking, and personal-social adaptability. Such a design
of evaluation is comprehensive when it includes the major values or
objectives that the modern school attempts to achieve for each individual
pupil. The teacher must regard his task as guiding the pupil’s
growth and development not only in academic subjects, but also in
the more intangible areas of interests, attitudes, appreciations, and
emotional and social adjustment. His concern is with the whole child
and not merely with the intellectual or academic growth of the child. 
This is a hallmark of the modern design of evaluation. 

One must remember, however, that valid and reliable techniques 
of measurement or appraisal are by their very nature restricted to an 
evaluation of major aspects of learning—concepts, skills, attitudes, 
thinking, and personal-social relationships. It is impractical to attempt 
to measure the whole result of an educative experience because it in- 
volves expending an unreasonable amount of time and energy. 

The types of behavior measured in a comprehensive evaluation of 
pupil growth represent a sampling, not the total of all behavior that 
might be measured. This sampling should be sufficiently comprehensive
to give an accurate appraisal of a pupil's over-all growth and
development, provided the design of the evaluation includes the major
objectives of the course or curriculum. Likewise, the items of a test or
scale represent a sampling, not the total of the concepts, information, 
attitudes, interests, or powers of thinking acquired by a pupil. This 
sampling, however, should be sufficiently comprehensive to give an 
accurate index of a pupil's status and growth. 

For certain of the objectives of the educative process, fairly satis- 
factory tests and measures have been devised. This is especially true 
for skills, concepts, and information. To some degree, it is true for 
various aspects of critical thinking, such as interpreting data, apply- 
ing principles, and identifying faults in thinking or logic. In physical 
health, fairly satisfactory tests and measures have been devised for 
important areas of growth. In mental health, in appreciations and 
creative expression, and in interests and attitudes, our present meas- 
ures are less well developed and need considerable refinement. Such 
measures, however, despite their technical faults and shortcomings, 
do provide some evidence that is valuable in judging the growth of 
the individual and in guiding his future growth and development. 
Some of these measures may be rather subjective, for example, rating 
scales, anecdotal records, essay examinations, and the like. As more
technical advances are made, these instruments will be improved.
The fact that they are now only tentative, however, should not discount
their present values for the teacher.


2. Are Changes in an Individual's Behavior the Basis for Evaluating 
His Growth and Development? 


The total behavior of the individual (intellectual, physical, emotional,
and social) should be the concern of the teacher and the supervisor
in every learning situation. When the child is learning arithmetic,
or science, or history, he is at the same time learning attitudes,
developing interests, and making emotional and social adjustments. 
If he is frustrated by too difficult tasks, or if he is bored by too easy 
tasks, then his attitudes and emotional and social adjustments will be 
adversely affected in the learning situations. The teacher, therefore, 
must remain aware of the various aspects of a pupil's behavior, even 
though the major purpose of a particular learning experience may be 
to master the formula for finding the area of a rectangle, or to recog- 
nize the chemical symbol for salt. Every learning situation includes 
multiple learning, involving not only intellectual concepts and skills 
but also physical, emotional, and social adjustment. The total behavior 
of the child, therefore, is affected to some degree by every learning 
experience. If the curriculum is designed with the broad objectives 
in mind, it follows that pupil behavior will be evaluated in terms of 
these multiple objectives or values. 


3. Are the Results of Evaluation Organized and Integrated Into a 
Meaningful Interpretation? 


The quantitative and qualitative results stemming from an evalua- 
tion program should be summarized into a meaningful pattern of 
scores (statistical, graphic, and verbal) so that a portrait of the individual
may be gained and compared with previous portraits. The directions
and areas in which he is growing, as well as the rate of
growth, should be evident. In this interpretation, an effort should be 
made to see the relationship between the scores on tests and such 
qualitative entries as anecdotal records, so that the total growth and 
development of the pupil may be guided wisely. 

This implies that data about physical health, emotional and social 
adjustment, interests, attitudes, and the results from achievement tests 
in various subjects will not be treated as separate and unrelated entities,
but will be correlated or integrated into a unified description of
the individual. A hypothetical and simple illustration can be cited.


Data from a cumulative record card show that James 
Morris has good physical health and vitality, except that his 
vision has been impaired so that this year he needs glasses 
for reading. His emotional and social adjustment was slightly 
less favorably rated than in previous years, perhaps because 
he was disturbed by his less successful achievement in sub- 
jects requiring reading—languages, history, and science. His 
interests are mainly academic. His attitudes are rather liberal
and broadminded. His academic achievement, except in
mathematics, is slightly below previous years, but it has im- 
proved since he obtained glasses. 
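
The kind of integration illustrated by this record can also be sketched in code. What follows is only an illustration under stated assumptions: the 1-to-5 rating scale, the numeric ratings, the tolerance, and the names `YearRecord`, `growth_profile`, and `describe` are all hypothetical and are not taken from any cumulative record card. The sketch shows one way of setting this year's ratings beside last year's and keeping anecdotal notes alongside the scores, so that the two kinds of evidence are read together.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class YearRecord:
    """One year's entries on a hypothetical cumulative record card."""
    year: int
    ratings: Dict[str, float]  # area -> rating on an assumed 1-5 scale
    notes: List[str] = field(default_factory=list)  # anecdotal entries


def growth_profile(previous: YearRecord, current: YearRecord) -> Dict[str, float]:
    """Change in every area rated in both years (positive means growth)."""
    shared = previous.ratings.keys() & current.ratings.keys()
    return {
        area: current.ratings[area] - previous.ratings[area]
        for area in sorted(shared)
    }


def describe(previous: YearRecord, current: YearRecord,
             tolerance: float = 0.25) -> List[str]:
    """Combine score changes and anecdotal notes into one description."""
    lines = []
    for area, change in growth_profile(previous, current).items():
        if change > tolerance:
            trend = "improved"
        elif change < -tolerance:
            trend = "declined"
        else:
            trend = "held steady"
        lines.append(f"{area}: {trend} ({change:+.1f})")
    lines.extend(f"note: {note}" for note in current.notes)
    return lines


# Invented ratings loosely following the illustration above.
previous = YearRecord(1954, {"emotional and social adjustment": 4.0,
                             "mathematics": 4.0,
                             "reading subjects": 4.0})
current = YearRecord(1955, {"emotional and social adjustment": 3.5,
                            "mathematics": 4.0,
                            "reading subjects": 3.0},
                     notes=["needs glasses for reading this year"])
```

Calling `describe(previous, current)` yields one line per area followed by the notes, so that a decline in the reading subjects appears next to the note about vision rather than in a separate record.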


4. Is the Evaluation Program Continuous and Interrelated with the 
Curriculum? 


In the modern school, evaluation is considered an ongoing process. 
Day-by-day observations, ratings, and tests should constitute the ap- 
praisal procedures by which the teacher attempts to evaluate and 
guide the pupil's growth. This is a concept different from the older 
approach, which considered testing as an end product rather than as 
a means for guiding growth, and looked upon measurement as an end- 
of-term activity. 

An evaluation program is interrelated with the curriculum because 
it is an integral part of guiding pupil experiences. The tests, question- 
naires, and other instruments by means of which evidence is gathered, 
provide the basis for judging growth toward the curriculum objectives. 
The evidence which is gathered for individuals or for groups in turn 
affects the curriculum by indicating those areas in which pupil 
achievement is not as effective as may be desirable, and by indicating 
those activities and experiences which may not be as conducive to 
pupil development as others. Thus, the evaluation program becomes 
a means not only for guiding the pupil's growth, but also for judging 
changes that may be necessary in the design of the curriculum and in 
the conduct of instructional practices. Evaluation also throws a further 
light on the meaning and re-definition of objectives of the curriculum. 
It helps in any refinement or modification that may be desirable in the 
objectives that are set up for the curriculum. 


The Use of Evaluative Techniques in School Systems 


One of the most systematic surveys of evaluation programs in action 
was conducted by Michaelis and Howard (3). The purpose of their 
survey was to review current practices in programs of evaluation in 
unified city school districts in California. Thirty-eight city school sys- 
tems provided data for the study. It seems reasonable to assume that 
the conditions and practices discovered in California prevail in mod- 
ern school systems in other parts of the nation. 


Among the major findings of the study are the following: Thirty-two
per cent of the school systems provide handbooks to guide the
teachers and to keep them informed about the program of evaluation.
These handbooks generally have sections on the relation of evaluation
to the educative process, purposes of evaluation, basic needs and dif- 
ferences of children, guiding principles of evaluation, and practical 
suggestions for using such informal devices as observations of behav- 
ior, anecdotal records, sociograms, interviews, checklists, rating scales, 
and inventories.

Of particular interest is the summary table of data (3:252) indi- 
cating the number and percentage of school systems using various 
evaluative devices. One hundred per cent of the school systems, for 
example, use psychological and educational tests, but only about ten 
per cent use sociograms. More detailed information is provided in the 
table below which is quoted from the published study. 


TABLE 1

California School Systems Reporting the Use of Various Evaluative Devices

EVALUATION DEVICE              SCHOOL SYSTEMS USING
                               Number     Per Cent
Tests                            38        100.0
Interviews                       34         89.5
Case Studies                     32         84.2
Case Conferences                 31         81.6
Group Discussion                 26         68.4
Anecdotal Records                24         63.2
Observation                      23         60.5
Files of Sample Materials        22         57.9
Questionnaires                   21         55.3
Rating Scales                    17         44.7
Check Lists                      14         36.8
Inventories                      12         31.6
Logs                              5         13.2
Diaries                           5         13.2
Sociograms                        4         10.5


Michaelis, John U., and Howard, Charles, “Current Practices in Evaluation in City 
School Systems in California,” Journal of Educational Research. 
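
The Per Cent column of Table 1 follows from the Number column and the thirty-eight responding school systems. The arithmetic may be sketched as follows for a few rows of the table; the function name `per_cent` and the dictionary layout are illustrative assumptions and are not part of the published study.

```python
# Number of California city school systems (out of 38 responding)
# reporting use of each device; counts copied from Table 1.
TOTAL_SYSTEMS = 38

counts = {
    "Tests": 38,
    "Interviews": 34,
    "Rating Scales": 17,
    "Sociograms": 4,
}


def per_cent(number_using, total=TOTAL_SYSTEMS):
    """Return the percentage of systems using a device, to one decimal."""
    return round(100 * number_using / total, 1)


for device, number in counts.items():
    print(f"{device:<16}{number:>4}{per_cent(number):>8.1f}")
```

The same function could be applied to the returns of a local survey by changing the total and the counts.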


In addition to the devices listed in Table 1, the authors report that 
individual school systems indicated the use of follow-up studies, auto-
biographies, clinics, social case work, evaluative criteria, stenographic
reports, recordings, interaction content records, photographs, movies, 
and pupils’ graphs. 



From the data reported in the study, it is evident that the school 
systems generally are using a wide range of techniques and methods 
to evaluate the growth and development of pupils. It is significant 
that more than fifty per cent of the school systems are using not only 
tests of intelligence and achievement, but also interviews, case studies, 
anecdotal records, observational methods, files of sample materials, 
and questionnaires to evaluate the development of various aspects or 
objectives of learning and instruction. It is equally interesting to note
that even such new techniques as logs, diaries, and sociograms are be- 
ing used in slightly more than one out of every ten school systems, a 
significant comparison with the practices of ten years ago. The trend
represents a distinct movement toward more comprehensive evalua-
tion programs.


Summary 


A comprehensive evaluation program for a school or school system 
requires careful planning and effective administration. The steps in 
careful planning of an evaluation program are: First, formulation of 
the major objectives of the curriculum. These objectives become 
guideposts in both curriculum development and evaluation. The for- 
mulation of the particular pattern of objectives for a school may be 
accomplished by (1) the curriculum analysis method, wherein the 
general purposes of a curriculum are analyzed into relatively inde- 
pendent objectives, (2) the conference method, in which committees 
are set up to indicate those objectives to be realized, or (3) the ques- 
tionnaire and interview method, wherein a questionnaire may be used 
with students, groups, or patrons of a school, and an interview fol- 
low-up made to check the validity of the questionnaire in determin- 
ing the purposes and aims for the curriculum. 

The second step in planning a program is the definition and clarifi- 
cation of the major objectives. These definitions should outline more 
or less specifically the skills, abilities, understandings, attitudes, and 
interests that are to be achieved. The third step is the selection of 
available tests or measures and the determination of the appropriate- 
ness of the selections for each of the major objectives as outlined. The 
fourth step is the construction of tests, scales, or techniques needed to 
evaluate those objectives for which no standardized or published 
measuring devices are available. The fifth step in the evaluation proc- 
ess is the application of the various formal and informal tests and 
techniques in order to make judgments about individual and group 
growth and development in each of the major objectives. 




The characteristics to be used for appraising the adequacy of a 
program of evaluation in the modern school may be embodied in the 
four questions which follow: 


1. Is the design of the evaluation program comprehensive, so 
that it includes not only abilities, skills, and understand- 
ings, but also the less tangible objectives of learning and 
instruction? 

2. Are changes in the behavior of the individual the basis
for evaluating his growth and development, since the 
total behavior of the individual—mental, physical, emo- 
tional, and social—should be the concern of the teacher 
and supervisor in every situation? 

3. Are the results of the evaluation organized into a mean-
ingful interpretation so that a portrait of the individual's
growth and development and the interrelationships of 
such growth become evident? 

4. Is the evaluation program continuous and interrelated 
with curriculum development? 


Surveys of evaluation programs in schools reveal that, compared 
with practices of ten years ago, the modern school or school system is 
using a wider variety of evaluation techniques or devices to assess 
pupil growth and development. Practically every school system uses
tests of intelligence and achievement, but, in addition, over eighty
per cent use interviews, case studies, and case conferences. Approxi-
mately two out of every three schools use group discussion and anec-
dotal records, as well as observational methods. About one out of
every two schools uses files of sample materials, questionnaires, and
rating scales. One out of every ten schools uses logs, diaries, and
sociograms.


Problems for Class Discussion 


1. Formulate the major objectives of a course of study or a subject in which
you are interested and define each major objective briefly in terms of 
pupil behaviors that would indicate growth in the objective. 

2. For a course in secondary-school science or social studies, list at least 
one available test or measure for each of the following: mastery of in- 
formation, attitudes, critical thinking, and interests. 

3. Using an adaptation of the list of evaluation devices in Table 1 in this
chapter, make a survey of the teachers in one or more schools to de- 
termine which devices are used most frequently. 




References Cited in This Chapter 


1. Eurich, Alvin C., and Pace, C. Robert, A Follow-Up Study of Minne- 
sota Graduates from 1928 to 1936. Minneapolis: Univ. of Minnesota, 
1938. 

2. Eurich, Alvin C., “A Plan for Evaluation for Bennington College,” Jour- 
nal of Educational Research, 34:633-634, April, 1941. 

3. Michaelis, John U., and Howard, Charles, “Current Practices in Evalua-
tion in City School Systems in California,” Journal of Educational Re-
search, 43:250-260, December, 1949.

4. Raths, Louis E., “Basis for Comprehensive Evaluation,” Educational 
Research Bulletin, 15:220-224, 1936. 

5. Smith, Dora V., “Recent Procedures in the Evaluation of Programs in
English,” Journal of Educational Research, 38:262-275, December,
1944.


6. Tyler, Ralph W., Constructing Achievement Tests. Ohio State Univ.,
1934.

7. Tyler, Ralph W., “Defining and Measuring Objectives of Progressive
Education,” Educational Research Bulletin, 15:67-71, 1936.

8. Wrightstone, J. Wayne, Appraisal of Experimental High School Prac-
tices. Bureau of Publications, Teachers College, Columbia Univ., 1936.

9. Wrightstone, J. Wayne, Appraisal of Newer Elementary School Prac-
tices. Bureau of Publications, Teachers College, Columbia Univ., 1938.


10. Wrightstone, J. Wayne, “Measuring Some Major Objectives of the Social 
Studies,” School Review, 43:771-779, 1935.


References for Further Reading 


Smith, Eugene R., Tyler, Ralph W., and others, Appraising and Recording 
Student Progress, Chapter I, p. 3-34. New York: Harper & Brothers, 1942. 


Chapter I defines major purposes of evaluation, basic assumptions, and 
such general procedures in developing the evaluation program as formu- 
lating objectives, classification of objectives, defining objectives in terms 
of behavior, selecting, trying, and improving evaluation methods, and
interpreting results.


Wrightstone, J. Wayne, Appraisal of Newer Elementary School Practices, 
Chapters VI-VIII, p. 118-165. New York: Bureau of Publications, 
Teachers College, Columbia Univ., 1938. 

Chapters VI-VIII discuss formulating cardinal objectives of elementary
education, assumptions underlying the newer practices, and an inclusive
plan of appraisal.


Types, Uses, and Qualities
of Major Evaluation Techniques


CHAPTER THREE 


In a comprehensive program of appraisal in the mod- 
ern school, objective tests constitute one of the major techniques of 
evaluation. Other major techniques by which data for assessing pupil 
growth and development may be obtained include, as previously 
noted: anecdotal records and observational methods; oral and essay 
examinations; questionnaires, inventories, and interviews; checklists 
and rating scales; personal reports and projective techniques; socio-
metric methods; and case studies and cumulative records.

For each of these evaluation techniques, a variety of uses may be 
discerned in current practice. These include interpretation and appli- 
cation of data and results for (a) administrative uses, (b) instruc- 
tional uses, (c) guidance uses, and (d) research uses. These uses over- 
lap because the categories are not mutually exclusive and the same 
evaluation data may serve multiple purposes. The adequacy of each 
evaluation method should be judged on commonly accepted criteria
(validity, reliability, objectivity, norms, and practicability) as these
apply to the educational situation and the purposes of the educators.


Classification and Definition of Major Evaluation
Techniques


For convenience in discussion, the large number of specific evalua-
tion techniques, methods, and devices have been classified into cate-
gories. While it may be argued that these categories are somewhat
arbitrary, any other classification scheme would be equally dependent
upon the purposes and logic of an author. The present classification
follows closely the discussion in Part Two of this volume, in which the
various methods are described and illustrated.


OBJECTIVE TESTS


Psychological tests, variously termed mental or educational tests, are 
designed to present an organized series, or pattern, of items, exercises, 
or stimuli to elicit responses which will reveal the relative degree of 
the psychological characteristics possessed by the individual to whom 
such tests are administered. Among the psychological characteristics 
frequently tested are general mental ability, achievement in particular 
subjects, reading abilities and skills, motor coordination, and aptitude 
for music or art. Other types of measures frequently included under 
the category of psychological tests are scales or inventories of atti- 
tudes, interests, and personality. The responses required in tests may 
be selecting the one word or statement among several which best an- 
swers a question, filling in by word or phrase an incomplete sentence 
or statement, the drawing of a simple figure, or marking each of a
series of beliefs or opinions to indicate agreement, uncertainty, or dis-
agreement with a statement.


CLASSIFICATION OF OBJECTIVE TESTS 
BY PSYCHOLOGICAL CHARACTERISTICS 


A variety of schemes may be used to classify objective tests: the
psychological characteristics measured, the purposes of the test, the 
content of the test, or the technical features of the test. For purposes 
of discussion the chart on page 31 presents a classification of objective 
tests according to the psychological characteristics that they purport 
to measure. 

It will be noted in this illustrative chart that the general psycho- 
logical characteristics by which objective tests have been classified 
include those terms commonly used in educational practice. Among 
these are tests of intelligence or mental ability; achievement as meas- 
ured by mastery of abilities, skills, and concepts in the various school 
subjects of reading, mathematics, language arts, social studies, and 
science; performance tests and aptitude tests for art, music, language,
and mechanical fields. Included under the category of tests are scales, 
inventories or questionnaires of attitudes, interests, and personality. 
The second column in the chart illustrates titles of some tests of more 
specific characteristics. Thus, intelligence has been broken down into 
tests of verbal intelligence and tests of non-verbal intelligence. Tests
of achievement have been broken down into tests of reading compre- 
hension, vocabulary, work-study skills, and tests of critical thinking. 
Mechanical aptitude tests have been broken down into tests of manual 
dexterity and tests of spatial relationships. 




It is not intended that the list of test titles in the chart should be 
inclusive. Rather, the intention is to illustrate some of the types of 
tests available when these are classified by both general and specific
psychological characteristics.


Chart Illustrating Classification of Objective Tests
According to Psychological Characteristics


GENERAL CHARACTERISTICS                 SPECIFIC CHARACTERISTICS

Intelligence or mental ability          Verbal intelligence
                                        Non-verbal intelligence

Achievement, or abilities, skills,      Reading comprehension
and concepts in reading, mathematics,   Vocabulary
language arts, social studies,          Computational ability
science, etc.                           Problem-solving ability
                                        Critical thinking
                                        Work-study skills
                                        Trade knowledge and performance

Aptitude: art, music, language,         Rhythmical discrimination
mechanical                              Tonal memory
                                        Manual dexterity
                                        Spatial relationships

Attitudes or beliefs *                  Attitude toward minorities, the
                                        Church, etc.
                                        Civic beliefs
                                        Militarism-pacifism

Interests *                             Vocational interests
                                        Reading interests
                                        Interest in extracurricular activities

Personality *                           Ascendence-submission
                                        Neurotic tendencies


* Also referred to as scales, inventories, and questionnaires.


CLASSIFICATION OF OBJECTIVE TESTS BY TECHNICAL FEATURES 


In addition to the classification of tests by psychological character- 
istics, it may be valuable to summarize briefly the classification of 
objective tests by such technical features as format and construction, 
standardization, use of results, and procedures for administration.
Some of the major classifications according to this logic are presented
in the chart on the following page.


Chart Illustrating Classification of Tests by Technical Features


INDIVIDUAL versus GROUP

INDIVIDUAL: Most individual tests require oral questioning and
observation of reactions of the individual to test exercises.
Examples: Individual Binet Examination of Intelligence or
Rorschach Test.

GROUP: Most group tests permit many individuals to be tested at
same time in a group situation. Examples: Stanford Achievement
Battery or Kuder Preference Record.


SUBJECTIVE versus OBJECTIVE

SUBJECTIVE: In subjective tests judgment and opinions of scorers
affect score assigned to test responses. Examples: Essay examination
or rating scale of personality.

OBJECTIVE: In objective tests a standard and specific scoring key
provides for assigning a uniform score by competent scorers to test
responses. Examples: California Reading Test or Differential
Aptitude Battery.


STANDARDIZED versus INFORMAL

STANDARDIZED: Require uniform test content and testing procedures;
each usually has norms, permitting comparisons with scores of the
normative population. Examples: Pintner, Otis, or California Tests
of Mental Ability.

INFORMAL: Informal tests are usually teacher-made for a particular
unit of study and do not have norms.


SPEED versus POWER

SPEED: Speed tests require the individual to complete as many
problems or tasks as he can at maximum speed. Examples: Purdue
Peg-Board Test or Speed of Reading Test.

POWER: Power tests usually present a series of problems or tasks of
increasing difficulty, allowing reasonable time for individual to
complete as many as he can. Examples: Cooperative Reading Test or
Henmon-Nelson Test of Mental Ability.


PERFORMANCE versus PENCIL-AND-PAPER

PERFORMANCE: Performance tests require demonstration of skill by
manipulating objects or apparatus. Examples: Form board tests or
typewriting tests.

PENCIL-AND-PAPER: These tests require answering questions by
drawing or writing. Examples: Interest Inventory, Mental Ability
Test, or Attitude Scale.


SURVEY versus DIAGNOSTIC

SURVEY: In survey tests, emphasis is usually upon a general and
undifferentiated measure of ability or skill. Examples: Batteries
of Achievement Tests.

DIAGNOSTIC: In diagnostic tests various subtests and interpretation
of their results permit identification of more specific abilities
and disabilities. Examples: Diagnostic Reading Test or Diagnostic
Arithmetic Test.


Objective tests may be classified in a number of ways. In this brief 
introduction, tests have been classified in two categories, first, by the 
psychological characteristics they purport to measure; second, by tech- 
nical features. Specific tests will be described and illustrated in sub- 
sequent chapters of this volume. Brief descriptions of other evaluation 
techniques used in comprehensive programs of appraisal are presented 
in the following paragraphs. 


ANECDOTAL RECORDS AND OBSERVATIONAL TECHNIQUES 


Anecdotal records and observational techniques are classified under 
the same category because of the similarity in the methods of obtain- 
ing data. Both methods depend chiefly upon an observer who records 
the activities, experiences, and expressions of individuals or groups. 
These techniques are discussed in more detail in Chapter Seven. 

Anecdotal records may be defined as cumulative notes or records 
which a teacher or observer makes of the representative behavior of 
selected pupils. In compiling the anecdotal record for a pupil, the 
teacher makes notes of sample situations, activities, experiences, and 
expressions of the pupil selected for special study. The note or record 
made should be a concise description of the behavior, e.g., “John tore
a page from Roy's book,” and not the teacher's interpretation of this
behavior, e.g., “John was angry and irritable.” In order to keep the
task of observing and recording from becoming too burdensome, the 
observations should be restricted to three or four important behavior 
characteristics and to only a few pupils. The anecdotal record illus- 
trates an informal method of teacher evaluation of personal and social 
characteristics of selected pupils. It is one of the practical methods 
available to the teacher for studying pupil adjustment. 

Observational techniques may be defined as systematic methods of 
analyzing and recording behavior by directly perceiving the individ- 
ual or group. The method is generally characterized by observing 
what the individual actually does and making an objective record of 
that which is observed. The method may use special techniques and 
tools, such as specially prepared charts or checklists for recording the 
behavior. Observational techniques may be roughly classified under 
those involving structured or controlled observation and those involv- 
ing unstructured observation. Under structured observation, the ob- 
server may record the occurrence of simple expressive behavior, such 
as smiling, laughing, physical or verbal contact with another individ- 
ual, or complex behavior and occurrence of acts, such as cooperation 
and initiative. Observation is also used in performance tests involving
structured work samples, such as performance on a form board test or
performance on a series of operations on an auto-mechanics test, in- 
cluding changing a tire, checking a battery, and aligning a wheel. 
Informal, unstructured observations of individuals at work occur daily 
in a classroom, as when the teacher watches the pupil performing 
operations in arithmetic exercises or participating in an informal social 
group. 


ORAL AND ESSAY EXAMINATIONS 


Oral and essay examinations are frequently used by the classroom 
teacher as informal methods of assessing and diagnosing the day-by- 
day growth and development of pupils. Both the oral and essay ex- 
amination, however, have also been designed for use in more formal 
and structured types of exercises. These techniques are discussed in 
greater detail in Chapter Six. 

Structured types of oral tests include a standardized test of oral 
reading and oral trade tests. Employment agencies and others involved 
with industrial personnel use the oral type of trade test frequently. 
Such tests consist of a series of questions which it is believed can be 
answered only by those who know and understand the particular 
trade or occupation for which the test is designed. A test for general
automobile mechanics, for instance, includes such questions as the
following:


What are steel power bushings made of? 
What joint is there between the differential and the trans-
mission?

Unstructured oral tests include those commonly used by teachers in 
the classroom wherein an individual or a group is asked oral ques- 
tions to which they may respond either orally or in writing, thus per- 
mitting the teacher to assess their mastery of subject matter or their 
attitudes, interests, or powers of thinking. In a like manner, the teacher 
may ask a pupil to read orally a selection from a book or text in 
order to determine his proficiency in oral reading. 

The essay type of examination may be defined as a relatively free 
written response to a problem situation or situations in which the 
written answer, intentionally or unintentionally, reveals evidence re- 
garding the functioning of the pupil's mental powers as they have 
been modified by a particular set of learning experiences. This type
of test is widely used by classroom teachers, especially in secondary
schools. 




QUESTIONNAIRES, INVENTORIES, AND INTERVIEWS 


Questionnaires and inventories are methods designed to obtain in- 
formation from individuals by means of a series of questions to which 
the responses are written. The interview is designed to obtain data in 
a face-to-face relationship. These techniques are discussed in more de- 
tail in Chapter Eight. 

The questionnaire may be defined as a list of planned written ques- 
tions that are related to a particular topic or series of topics. Space is 
provided for indicating the response to each question. The question- 
naire is intended for submission to a number of persons for reply. 
It is commonly used in obtaining specific facts and data and in the 
measurement of attitudes and opinions. The structured type of ques- 
tionnaire is prepared in such a manner that the answers may be 
checked or underlined by the respondent. The unstructured type of 
questionnaire is sometimes called the “open-end” questionnaire. The 
individual makes a free response to the question rather than selecting 
one of several answers which have been presented. The inventory is 
similar to the questionnaire, except that it has, in actual practice, 
been defined as a structured test or checklist used to determine the 
examinee’s attitudes, opinions, or feelings. 

The interview is a method for obtaining data by face-to-face con- 
ference with an individual. The interviewer may employ observation 
checklists and rating techniques as a part of the interview. Inter- 
views may be classified into three general types. The first type is the 
diagnostic interview to discover detailed and related facts, opinions, 
attitudes, and personal experiences about the individual being inter- 
viewed. The second type is the survey interview in which the interest 
is not primarily in the person except as he can contribute an opinion 
or fact about a problem. A good example of this type is the interview 
used in connection with the public opinion poll in which the interest 
is to obtain data about a social, political, economic, or educational 
problem from a sampling of a population. The third type of inter- 
view may be called a treatment interview. This type is used mainly 
to help an individual to adjust to some particular problem or situation. 


CHECKLISTS AND RATING SCALES 

The checklist may be defined as a prepared list of items that may 
relate to a person, procedure, institution, building, or similar object. 
The list is used for the purpose of observation and evaluation by 
which the observer may show by check marks the presence, absence, 
or frequency of occurrence of each item being investigated. The
diagnostic checklist, for example, is a device used in the classroom or
clinic to aid in determining the deficiency of a pupil in reading, in 
arithmetic, or other fundamental skills. The major difference between 
checklists and rating scales is that the checklist requires the checking 
of specific items rather than the judgment of the degree to which a 
characteristic or the behavior item is present. 
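
The distinction may be made concrete with a small sketch. The item names, the observer's marks, and the five-point scale below are hypothetical; they are chosen only to show that a checklist records presence, absence, or frequency, while a rating scale records a judged degree.

```python
from collections import Counter

# Checklist: the observer only marks each occurrence of a listed item.
# Item names are hypothetical examples of reading difficulties.
checklist_items = ["omits words", "reverses letters", "loses place"]
marks = ["omits words", "loses place", "omits words"]  # one mark per occurrence

frequency = Counter(marks)
checklist_record = {item: frequency[item] for item in checklist_items}
# A count of 0 means the item was absent during the observation.

# Rating scale: the observer judges the degree of a characteristic,
# here on an assumed scale from 1 (lowest) to 5 (highest).
rating_record = {"cooperation": 4, "initiative": 2}

print(checklist_record)
print(rating_record)
```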

Rating scales or rating methods are devices for the systematic re- 
cording of observations and judgments on a scale of units or values 
given to objects such as school buildings, textbooks, specimens of 
handwriting, or to persons, such as pupils or teachers, or to person- 
ality traits and attitudes. There are five major types of rating scales 
as follows: (a) The descriptive rating scale, which contains a num- 
ber of phrases defining varying degrees of a trait or characteristic; 
(b) The graphic rating scale, in which a scale of units is indicated on 
a line and the observer records his rating by checking a point on the 
line; (c) The product scale, consisting of a series of products, for 
example, specimens of handwriting arranged according to values de- 
termined by a jury. A product is rated by matching it with the scale 
products; (d) The man-to-man rating scale consisting of descriptions 
of a number of persons, each of whom is known to the user of the 
scale, representing the highest, lowest, and intermediate degree of 
merit for rating other individuals; and (e) The numerical rating scale, 
frequently called a score card, consisting of a number of items or 
characteristics each of which has been assigned a numerical value. 
Score cards have been constructed for rating school buildings, ele- 
mentary-school practices, teaching ability, and courses of study. 
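
A numerical rating scale of the score-card type can be sketched as a table of items with assigned values and a total. The items, the maximum values, and the points awarded below are hypothetical and are not taken from any published score card.

```python
# Hypothetical score card for rating a course of study.
# Each item carries an assigned maximum numerical value.
score_card = {
    "statement of objectives": 30,
    "suggested activities": 40,
    "evaluation procedures": 30,
}


def total_score(ratings, card=score_card):
    """Sum the points awarded, checking each against its assigned maximum."""
    total = 0
    for item, maximum in card.items():
        points = ratings[item]
        if not 0 <= points <= maximum:
            raise ValueError(f"{item}: {points} is outside 0-{maximum}")
        total += points
    return total


# Points a rater might award on each item (illustrative values).
ratings = {
    "statement of objectives": 24,
    "suggested activities": 31,
    "evaluation procedures": 18,
}
print(total_score(ratings), "out of", sum(score_card.values()))
```

Checking each rating against its maximum guards against a clerical slip inflating the total.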


Checklists and rating scales are discussed in more detail in Chapter 
Nine. 


PERSONAL REPORTS AND PROJECTIVE TECHNIQUES 


Personal reports and projective techniques have been used very 
widely to study the personality of pupils. Personal reports are essen- 
tially self-ratings and are frequently called personality tests or in- 
ventories. An example of a self-descriptive personal report is the 
California Test of Personality. On such an inventory, the pupil an- 
swers or checks “yes” or “no” to such items as “Do you daydream?” 
or “Do you have many friends?” Variations of the personal report in- 
clude autobiographies or interviews, in which the individual reports 
on his personal history, likes, dislikes, goals, and problems. 

In recent years, projective methods have been used increasingly to 
assess the personality of individuals. A projective method for the study
of personality involves the presentation of a stimulus situation de-
signed to elicit responses which will reveal the personal aims and 
projections of the examinee. The Rorschach, one of the most widely 
known and used projective methods, uses ink blots as the stimulus 
for eliciting responses or projections of the examinee. Pictures, such 
as those devised by Murray in his Thematic Apperception Test, are 
used in a similar manner. In addition to these methods, other pro- 
jective techniques include analysis of handwriting, painting, or draw-
ing, completion of sentences or stories, role-playing, and the use of
toys or puppets.

These methods are discussed in more detail in Chapter Ten.


SOCIOMETRIC METHODS 

Sociometric methods may be defined as devices for revealing the 
preferences, likes, or dislikes that exist among the members of a group. 
This technique is characterized by the procedure of obtaining from 
individuals in a social unit a statement as to which group members
would be preferred as cooperating participants in various activities or
relationships. An individual may be asked, for example, to name the
person that he would wish as a workmate, as a seatmate, or as a team-
mate in a school situation. This technique may be used for revealing
the group structure of a social unit and identifying subdivisions of the
group. It permits an analysis of various types of group members as,
for example, leaders, isolates, and those in rival factions or cliques.

This technique is discussed in Chapter Eleven.
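
The tabulation of sociometric choices may be sketched as follows. The pupils' names and their choices are hypothetical, and treating the most-chosen member as a possible leader is a simplifying assumption made for illustration only.

```python
from collections import Counter

# Hypothetical responses to "Whom would you wish as a workmate?"
# Each pupil names one other member of the group.
choices = {
    "Ann": "Ruth",
    "Ben": "Ruth",
    "Carl": "Ben",
    "Dora": "Ruth",
    "Ruth": "Ben",
}

received = Counter(choices.values())  # choices received by each member
members = sorted(choices)

# Isolates: members chosen by no one.
isolates = [m for m in members if received[m] == 0]

# Most-chosen members, sometimes treated as possible leaders.
top = max(received.values())
most_chosen = [m for m in members if received[m] == top]

# Mutual pairs: two members who choose each other.
mutual_pairs = sorted(
    {tuple(sorted((a, b))) for a, b in choices.items() if choices.get(b) == a}
)

print("isolates:", isolates)
print("most chosen:", most_chosen)
print("mutual pairs:", mutual_pairs)
```

A sociogram presents the same choices as a diagram; the tally above is only the counting step that precedes the drawing.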


CASE STUDY 


The case study or case method may be defined as the use of com-
prehensive data about an individual as the basis for diagnosing and
interpreting his conduct or behavior. It is a method of investigation 
that concerns itself with the careful examination of factors that are 
significant in the life of the person under study. Emphasis is placed 
on discovering what is unique in the case rather than what is char- 
acteristic of large numbers of individuals. The findings are especially 
related to the treatment of maladjustments displayed by the indi- 
vidual. It is a diagnostic and remedial procedure based on a thorough 
investigation of a person in order to acquire knowledge of his history, 
his home conditions, and all of the things that may have contributed 
to his behavior difficulties. The ultimate aim of the case method is to 
diagnose and evaluate the total behavior of the individual in order 
that appropriate remedial measures may be applied. The case study
generally contains a description of the behavior indicative of malad-
justment, a physical and health report based on an examination, family
background and history, early childhood history, mental capacity and 
educational achievements as revealed by tests and observational 
methods, a report on the child's personality, and a description of 
previous efforts to improve the adjustments. See Chapter Twelve. 


CUMULATIVE RECORD 


The cumulative record is an individual record, usually of a perma- 
nent nature, that is kept up to date by the teacher or other school 
personnel. It may be in the form of a card, folder, or packet. A cumu- 
lative record, regardless of its format, is an educational history con- 
taining fairly complete information about the pupil's school achieve- 
ment, course of study, attendance, health, and similar pertinent data. 
The record folder or record packet is frequently used as a con- 
venient filing device for accumulating data over a period of time. 
This may contain some actual samples of the pupil's writing, drawing,
and school work, as well as anecdotal records and other recorded
observations made by teachers and school personnel. This technique
is discussed in Chapter Thirteen.


Uses of Evaluation Data 


The proper use of data from evaluation techniques requires that 
one or more important and desirable purposes shall be served. In- 
telligent planning is basic to the wise use of information obtained 
from various methods of evaluation. One way of classifying the uses 
of various measures is according to the functions of various school 
officers or personnel, namely, the administrator, the supervisor, the 
teacher, the guidance counselor, and the research worker. Several 
purposes may be satisfied by the same data. Thus, the data from an 
intelligence test, an achievement test battery, an interest inventory, 
or a sociometric technique may be used by school personnel for dif- 
ferent purposes, as illustrated in the following paragraphs. 


ADMINISTRATIVE USES 


The administrator may use results of evaluation methods to provide
records of pupil adjustment, interests, aptitudes, and achievement.
The data may be entered upon the pupil’s cumulative record card or
in the cumulative folder and become a basis for the evaluation of the
individual’s growth and progress or that of the class group. Another
use by the administrator is to provide reports to parents. Frequently,
the principal may find it necessary and desirable to supplement his
opinions or the teacher's opinions about a pupil by documentary
evidence gathered by means of tests, questionnaires, interviews, or
anecdotal records. Such evidence may frequently be used in reports to,
or in conferences with, parents. A third use is to make available more
systematic and objective records when a pupil is transferred to
another school. Such records permit a better interpretation of a pupil's
status and facilitate placement in a congenial classroom in the new
school. The administrator may also use evaluation data to provide
periodic reports of school progress to the patrons in a community. 
Frequently, also, the data from sociometric, personality, aptitude, and 
achievement tests may be consulted in the classification of pupils for 
instructional purposes. Although learning is mainly an individual mat- 
ter, the usual classroom situation demands that each pupil learn as 
part of a group. Data from evaluation techniques are especially valu- 
able in educational and vocational guidance to help counsel and assign 
pupils to such different curricula as college preparatory, trade, busi- 
ness, and general courses. 


INSTRUCTIONAL USES 

The supervisor, likewise, may use evaluation results for a variety 
of purposes. His major task is to help the teacher do a better teaching 
job. This responsibility can best be realized if both the teacher and 
the supervisor have evidence about the status of the pupil as well as
about his needs and interests. One of the uses that a supervisor may
have for evaluation data is to determine the status of a class or a
pupil in some of the major objectives of the curriculum. This will
permit him to evaluate teaching methods and instructional materials,
and to indicate desirable changes in instructional procedures and
pupil-teacher relationships. Again, this may be accomplished by
obtaining evidence of the relative contribution which a particular
teaching method or particular instructional materials make to the
pupil's growth and development.


The teacher uses test and other evaluation data for a variety of
purposes, many of which are similar to those of the administrator
and supervisor. The teacher and supervisor often discuss quantitative
and qualitative data in order to arrive at an agreement on various
instructional and learning problems. The teacher may use the results
of tests and measures: (1) to determine the status of each pupil in
various subjects and in various objectives of the curriculum, (2) to
identify the gifted pupil, “normal” pupil, and slow-learning pupil,
(3) to group pupils for instructional purposes within the class, (4) 
to analyze or diagnose an individual pupil's difficulties and rate of 
growth, and (5) to determine the status of the individual or class at 
the beginning and at the end of the term. 
Such evaluation techniques as anecdotal records, observations, rating
scales, personal reports, interviews, and sociometric methods aid
the teacher to assess and to guide more wisely the growth and
development of pupils. If anecdotal records, observations, rating scales,
and interview methods have been systematically used to collect data on
personal and social adaptability, the teacher may use the results (1)
to identify pupils who are well adjusted and those who are poorly
adjusted, (2) to diagnose the probable causes or contributing factors
for maladjustment, and (3) to set up individual and group conditions
and situations to aid, whenever possible, growth toward better
adjustment. In a like manner, sociometric methods may be used as an
aid to identify “leaders” and “isolates” among a pupil group and to
establish social relationships in the classroom that will contribute to
the maximum social development of each pupil. Interest inventories
may be used to identify pupil interests in reading or other educational
and vocational activities, thus permitting the teacher to counsel
the pupils and adapt the curriculum to their needs. Attitude scales
may be used to discern individual or group attitudes toward minority 
groups. Although the case study is usually reserved for application 
to the seriously maladjusted pupil, the cumulative record should be 
studied as a method for evaluating and guiding the growth and devel- 
opment of every pupil. 


EDUCATIONAL AND VOCATIONAL GUIDANCE


Educational guidance is now considered an integral part of the 
educational program. Skilled guidance has become a part of each 
teacher's responsibility to his pupils, as well as the responsibility of
the guidance counselor. Individual needs and abilities are the bases 
of guidance. The teacher or counselor uses all pertinent data to ad- 
vise or guide the pupil in his physical, mental, emotional, and social 
growth and development. This guidance aids the pupil in selecting 
appropriate courses of study, changing his program of studies, moti- 
vating him to complete high school, selecting a college, understanding 
his interests and abilities, and improving his personal adjustment. 

Vocational guidance is especially important in the modern secondary
school. Advice is offered to an individual about whether or not
to undertake certain vocations. Pertinent data can be obtained in
part through intelligence, achievement, and aptitude tests, recorded 
observations, anecdotal records, interest inventories, and the cumula- 
tive record. Based on this information, the pupil can be guided toward 
occupational choices adapted to his interests, aptitudes, and abilities. 
By using the data collected by means of various techniques, the 
teacher or counselor will have a more secure foundation for helping 
the pupil to plan for successful adjustment in an occupation. 


RESEARCH USES 


Data gathered by means of various evaluation techniques are also 
used for research purposes. Carefully designed studies are sometimes 
made, for example, of the effectiveness of different methods of teach- 
ing reading or teaching arithmetic or meeting the personal-social 
needs of pupils. Occasionally, more ambitious studies are designed 
and conducted to judge the effectiveness of a curriculum experiment,
such as a core curriculum. These studies may be conducted
by a research bureau of a city school system or of a college or
university. On the other hand, individuals interested in advanced graduate
work may also undertake such studies. Other variations of research
uses of evaluation techniques are diagnostic studies of learning
difficulties, age or grade placement of subject matter, curves
of learning for various curricular materials, correlations among various
measures of aptitude, ability, and personal characteristics, and
case studies of particular pupils.


MISUSES OF RESULTS 
Evaluation results may sometimes be misused. If the design of the 
appraisal program is narrow and limited, the testing program may 
tend to determine the emphasis upon specific objectives of the cur- 
riculum to the detriment of others and to the detriment of desirable 
trends in pupil growth. Sometimes, test results of a very partial nature
are used to estimate or to rate the teacher's teaching ability. While
such evidence may serve as a part of the data to be considered in
rating teachers, few educators would defend a rating made solely
upon this basis. Some administrators, however, misuse test results by
employing them for purposes such as this. Reliable and valid
instruments of measurement are, by their very nature, restricted to an
appraisal of limited aspects of pupil behavior or growth. It is
impossible to measure the whole result of an educative experience by
any one test or battery of tests and measures. The fact remains,
however, that, by evaluating many important and vital aspects of
experiences, appraisals may be obtained of the relative merits of diverse
educational practices. 


Qualities for Judging an Evaluative Technique 


A test or evaluative technique is judged for its adequacy, efficiency, 
and consistency as a measuring device on the basis of commonly ac- 
cepted qualities. These qualities are validity, reliability, objectivity, 
norms, and practicability. Validity is that quality which indicates the 
relationship of a measure or diagnosis with meaningful criteria of 
learning or behavior. Some criteria may be selected to show effec- 
tiveness to predict future performance, other criteria to indicate im- 
mediate status, other criteria to establish the representative nature 
and scope of content or behavior, and still other criteria to provide 
data for supporting or rejecting some psychological theory. Reliability 
is that quality which indicates the consistency, equivalence, or stabil- 
ity of a measure that is obtained. Objectivity is that quality which 
indicates the identity or similarity of the scores or diagnoses obtained 
from the same data by equally competent scorers. A norm provides 
an average or typical value for a measure or diagnosis obtained by 
the administration of a measuring instrument to a specific population 
so that subsequent scores or measures for an individual or a group 
may be compared with the typical values of the normative popula- 
tion. Practicability is that quality which indicates the feasibility for 
the general use of a test or evaluative technique on such bases as 
cost, time required for administration, ease of administration, ease 
of scoring, and ease of interpretation of the results. 
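The use of a norm described above can be illustrated with a brief sketch. It is offered only as an example: the function name, the normative scores, and the choice of the mean and a simple percentile rank as the "typical values" are assumptions made for illustration, since the text prescribes no particular statistic.

```python
from statistics import mean


def compare_to_norm(score, norm_scores):
    """Compare one score with a normative sample.

    Returns the difference from the normative mean and the percentage
    of normative scores falling below the given score.
    """
    norm_mean = mean(norm_scores)
    below = sum(1 for s in norm_scores if s < score)
    percentile_rank = 100.0 * below / len(norm_scores)
    return score - norm_mean, percentile_rank


# Hypothetical normative sample and one pupil's score
norm_scores = [10, 20, 30, 40]
difference, percentile_rank = compare_to_norm(30, norm_scores)
print(difference, percentile_rank)
```

Published norms usually rest on far larger samples and on more refined definitions of percentile rank; the sketch shows only the principle of comparison with a normative population.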


VALIDITY 


In a test or other evaluative instrument, validity is that character- 
istic which indicates the degree to which the instrument measures or 
provides a diagnosis of the psychological characteristics that it pur- 
ports to measure. Validity is judged by the relationship between the 
measure or diagnosis and such meaningful criteria as ratings by 
teachers or performance of specific tasks. Cronbach (2) has indicated 
that the basic question in validity is how well a test or evaluative 
technique does the job that it is employed to do. Validity is not an 
absolute characteristic of an evaluative technique; it is relative to the 
purpose of the test user. The same technique may be used for several 
different purposes, and its validity may vary from high to low
depending upon the purpose. In a test of arithmetic computation, for
example, the validity of the test may be high for determining the 
present status of pupils in skills of arithmetic computation. Its validity 
may be moderate for judging aptitudes of pupils for business arith- 
metic. Its validity may be low for predicting success in the mathe- 
matical aspects of a subsequent course of study in physics. Validity, 
therefore, must be defined in terms of the purpose that is to be served 
by the particular instrument or technique employed. 

Various methods are employed to validate tests or other techniques 
of measurement, but all methods require meaningful criteria. These 
criteria may include future achievement, school marks, ratings by 
teachers or experts on pupils' abilities, skills, interests, attitudes, or
personal-social adjustment, and content analysis of courses of study or 
an area of behavior which is the goal of training and guidance. In 
addition to these methods, some measures are validated by correlation 
with other known measures or by comparison between the accom- 
plishments of widely spaced groups in the psychological function, or 
characteristic, under consideration. 

The definition of a meaningful criterion and its reliable measure- 
ment present difficulties. In some instances, the translation of success 
or performance into a measurable and unambiguous score raises a 
number of serious problems. Frequently, teachers’ grades are used as 
a criterion for validating educational tests and techniques, but these 
grades include not only an evaluation of the pupil's mastery of sub- 
ject matter but also a rating of the pupil's effort, verbal fluency, work 
habits, sociability, and other aspects of his personality. For the 
teachers' purposes in grading, perhaps, all of these factors and others 
may reasonably be included in an evaluation of the pupil. In valida- 
tion, however, the complexity of factors affecting teachers’ grades 
tends to obscure many of the relationships which exist between this 
criterion and the psychological characteristic measured by a test or 
technique. In addition, when grades over a period of several years are 
combined, the comparability of ratings by different teachers raises 
another problem of defining the criterion with reasonable precision. 
This example illustrates how difficult it is to obtain a meaningful cri- 
terion and to obtain a reliable rating for the criterion. Despite these 
limitations, criteria are defined and rated as carefully as possible. 


Kinds of Validity

Since validity is not an absolute characteristic, several kinds of
validity may be identified, depending upon the purpose for which
the evaluative method is to be used. According to a report (1)
prepared by a joint committee of the American Psychological Association,
American Educational Research Association, and National Council on 
Measurements Used in Education, four categories of validity may be 
distinguished. These are predictive validity, concurrent validity, con- 
tent validity, and construct validity. 


Predictive Validity Predictive validity is judged or estimated by the 
degree of the relationship between a measure and subsequent cri- 
terion measures or judgments. This type of validity is required in such 
measures as tests of intelligence or academic aptitudes used for pre- 
dicting later scholastic success; in tests of aptitudes used for predict- 
ing later success in some field of study or work such as music, art, or 
stenography; in vocational interest inventories used to aid in the 
choice of a vocation; and in projective techniques or similar measures
which are used to predict the future personal-social adjustment
of an individual. Questionnaires and interviews designed to elicit 
opinions and beliefs and which are used for predicting future be- 
havior of individuals or groups must also satisfy this type of validity. 
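The "degree of the relationship" between a measure and a later criterion is commonly expressed as a product-moment correlation coefficient. The following sketch computes such a coefficient; the function name and the two sets of scores are hypothetical, and the product-moment coefficient is assumed here as the index of relationship because the text does not name a particular statistic.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two score lists of equal length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    # Raises ZeroDivisionError if either list has no variation.
    return sxy / sqrt(sxx * syy)


# Hypothetical data: aptitude scores and school marks one year later
aptitude = [95, 110, 102, 120, 88]
later_marks = [70, 82, 75, 90, 68]
print(round(pearson_r(aptitude, later_marks), 2))
```

The same computation serves for concurrent validity; only the time at which the criterion is obtained differs.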


Concurrent Validity Concurrent, or status, validity indicates the cor- 
respondence, or relationship, between a measure and the more or less 
immediate behavior or performance of identifiable groups. The dif- 
ference between concurrent validity and predictive validity is solely 
a matter of time. Predictive validity requires correspondence with a 
future criterion whereas concurrent validity requires correspondence 
with the criterion at the time of testing or diagnosis. If the Thematic 
Apperception Test provides a relatively successful diagnosis of some 
personality difficulties of an individual at the time of testing, the 
degree of concurrent, or status, validity in doing so may thus be 
established. In a like manner, a personal report or personality test 
which gives a reasonably accurate measure or diagnosis of personal
adjustment or maladjustment at the time of administration has a 
degree of concurrent validity. An opinion questionnaire or interview 
for immediate and not for future use is another example of a situa- 
tion in which the need for concurrent validity applies. If the results 
of checklists of reading or arithmetic disabilities, a sociometric anal- 
ysis of a class, an observational method, or an interview are to be 
used for immediate diagnosis, these techniques must be judged for 
their effectiveness in terms of concurrent validity: how well they
measure present behavior or performance.


Content Validity Content validity is judged by the degree of rela- 
tionship or correspondence between a measure or diagnostic tech- 
nique and achievement in the specific course or curriculum. In a 
technical sense, the test or technique will sample a universe of
possible behaviors, content, or activities of a course or curriculum. An
academic achievement test in history, science, or literature is exam- 
ined frequently for content validity by checking the test items against 
courses of study and textbooks or curriculum guides. Performance 
tests for trade or industrial arts courses may be examined for content 
validity by checking them against a detailed job analysis. 


Construct Validity Construct validity may be established by indi- 
cating the correspondence or relationship between the results of a 
technique of measurement and other indicators of the characteristic 
or characteristics that are measured or assessed. This type of valida- 
tion is often used for tests or measures of a psychological characteris- 
tic that is assumed to exist by empirical or theoretical deduction. 
Briefly defined, a psychological construct is an ability, aptitude, 
characteristic, or trait that is assumed to exist to explain some aspect 
of human behavior. For example, it may be assumed that general 
mental ability comprises such fairly independent factors as verbal 
ability, number ability, perceptual ability, space ability, reasoning,
and memory. In order to establish the construct validity of a test of
number factor, it may be necessary for an investigator to correlate the
results of a test of number factor with results of other tests involving
not only number reasoning and manipulation, but also other
types of mental factors or abilities, to make systematic observations
of persons having known scores on the test, and to show that the test
discriminates between groups widely spaced in ability to manipulate
and reason with numbers.


The same approach might be used to judge the validity of
interpretations of personality characteristics that are assumed to be
measured or diagnosed in a projective technique, such as the Rorschach
ink blots. It may be assumed that individuals who perceive forms on
the Rorschach are better able to resist emotional stress. Validation
would call for evidence that scores or results vary from person to
person or from occasion to occasion as the theory would require.
Often construct validity is established by considering together many
different kinds of incomplete but complementary evidence. Thus, it
may rely on correlation of the test or technique with other tests, on
systematic observation of persons having known scores or results,
and on evidence that the test discriminates between widely spaced
groups. If it is assumed that a score on “neuroticism” from a
personality measure is associated positively or negatively with persons
having certain defined physiques and with their total temperamental 
pattern, including their adjustment to certain occupations, the proof 
can best be established by the construct validity methods just cited. 


Establishing the Validity of Various Techniques 


Observational techniques, including rating scales, checklists, and
anecdotal records, will use one or more of these methods of
establishing validity, depending upon their purpose. This is also true
for the interview, the case study, and cumulative records when these
are used as methods for assessing an individual's behavior.

It should be noted, however, that in predictive and concurrent 
validity the criterion of future performance or immediate performance 
is of major concern, and the test or measure is of interest only as an 
indirect estimate of the criterion behavior. In measures of achieve- 
ment and performance where content validity is established, the test 
behavior, as a sample of the universe of appropriate content or be- 
haviors, is the characteristic with which the examiner is chiefly con- 
cerned. Construct validity is ordinarily studied when several indirect 
measures of some characteristic are available and it is desired to show 
that the test or instrument measures a theoretical characteristic or 
construct. None of the indirect measures may be a good measure of 
the theoretical characteristic or construct, yet each measure supports 
or complements others. The characteristic is of central importance 
rather than the test behavior itself. 

This survey of validity indicates the central importance of a mean- 
ingful criterion, and clearly indicates the complex and difficult nature 
of establishing validity. Satisfactory criterion measures are difficult to 
achieve. Criteria for judging proficiency in a job, a course of study, 
or in personal-social adjustment require an immense investment of 
time and professional skill, despite which the results are often limited 
in scope and of low reliability. The limitations which these difficulties 
represent lead to the conclusion that obtaining satisfactory criterion
data is perhaps the most difficult and costly aspect of measurement
and evaluation.


RELIABILITY 


Reliability provides an index of the accuracy with which a test or 
instrument measures. Reliability is commonly defined as an estimate
of the degree of consistency or constancy among repeated measurements
of individuals with the same instrument. Whenever any physical
object or psychological characteristic is measured, that measurement
contains some degree of chance error. If the chance errors are
small relative to the variation, or range, of measures of the object or
characteristic measured, the reliability or consistency of measures is
high. Thus, the reliability of a measure of height of adults is high
when the majority of chance errors are less than 1/4 inch relative to
a range from 60 to 76 inches.

Reliability is usually expressed as a coefficient of reliability, but is 
sometimes expressed as the standard error of measurement. The stand- 
ard error of measurement may be termed absolute consistency. This 
degree of absolute consistency may be observed in the actual amount 
of variation which results when a particular measuring instrument is 
applied more than once to the same individual. It is to be noted 
that the standard deviation of such a distribution of repeated meas- 
urements is also referred to as the standard error of measure- 
ment. 
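The two expressions of reliability are connected by a standard relation of classical test theory. The relation is supplied here for illustration and is not stated in this passage; the symbols are notation assumed for the purpose, with s_x the standard deviation of the obtained scores in the group and r_xx the reliability coefficient.

```latex
\[
  \mathrm{SE}_{\text{meas}} = s_x \sqrt{1 - r_{xx}}
\]
```

Under this relation, chance errors that are small relative to the spread of scores (a small standard error of measurement against a large s_x) correspond to a reliability coefficient near unity, which agrees with the illustration of adult height given above.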

Relative consistency, expressed by the reliability coefficient, is more 
frequently used in psychological measures. It is impossible to measure 
the same individual repeatedly by means of psychological tests with- 
out directly affecting the psychological characteristic. For example, 
repeated administration of a reading test will directly affect the
reading ability of the individual tested. In order to meet this difficulty in
measuring psychological characteristics, methods of estimating the
reliability coefficient have been established.

The major procedures for estimating coefficients may be briefly
summarized as follows:

(a) Administration of two equivalent tests and correlation of the
resulting scores.

(b) Repeated administration of the same test or testing procedures
and correlation of the resulting scores.

(c) Subdivision of a single test into two presumably equivalent
halves, each scored separately, and the correlation of the resulting
two scores.

(d) Analysis of variance among individual items and determination
of the error variance from this statistic.

Reliability may thus be considered as the degree to which a true or 
perfect measurement of each individual is obtained when a measure 
is applied. It is impossible, however, to obtain a perfect measure in 
the physical sciences or in the biological sciences. The measure is 
always contaminated or made impure by chance factors which affect 
the accuracy of the measurement. Some sources of variation
in psychological measurement are indicated in the following:


1. Actual difference among individuals in the psychological 
characteristic being assessed or measured. This applies to 
both general and specific characteristics, abilities, and 
skills. 

2. Differences in abilities to take a specific test, such as: abil- 
ity to comprehend directions or instructions, effects of 
practice in taking previous tests, and facility in dealing 
with specific test exercises. 

3. Differences associated with chance factors, such as: fluctuations
in performance, memory, or reasoning, fortunate
selection of answers by guessing, and unique possession
of a particular fact, knowledge, or opinion in an exercise.

4. Differences of a personal but temporary nature affecting
the performance of the individual, such as: health, energy,
fatigue, motivation, and emotional tension.

5. Differences associated with external conditions, such as:
heat, light, ventilation, noise, broken pencil, and interference.


Types of Reliability Coefficients 


Reliability is a general term and refers to several types of evidence 
regarding the consistency of measurement. Different types of relia- 
bility coefficients answer different questions and permit different in- 
ferences regarding the evidence. Three major types of reliability 
coefficients are generally used in describing consistency of measurement
for psychological tests and techniques. These are: (a) coefficient
of internal consistency, (b) coefficient of equivalence, and (c) coefficient
of stability.


Coefficient of Internal Consistency The coefficient of internal
consistency is the estimate obtained from the single administration of a
test or instrument to a representative group of individuals. It
indicates how accurately or consistently the test or instrument measures
the individual's performance at a particular moment. It provides an
estimate of the probable variability in his score or result if a different
sample of questions, activities, or behaviors were examined or
observed. Two methods are generally used to estimate this coefficient.
They are: (a) split-half method and (b) Kuder-Richardson method.
In both of these methods the same general trait, ability, or
characteristic of a homogeneous nature should be measured and the test
should not be speeded.


The method of estimating reliability for a test by the split-half 
method may be described briefly as follows: The test is divided into 
two halves as equivalent in difficulty of items and in content as is 
possible. Each individual's test is scored so as to obtain a score for 
the first half, a score for the second half, and a score for the total test. 
The agreement or correlation between the halves is determined, and 
the Spearman-Brown formula (4:40-41) or Guttman formula (3) is 
used to estimate the reliability of the test. 
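The split-half procedure just described may be sketched as follows. The sketch assumes that each pupil's scores on the two halves are already available and applies the usual Spearman-Brown correction for a test of double length; the function names and the scores are hypothetical, and the Guttman formula is not shown.

```python
from math import sqrt


def pearson_r(x, y):
    """Product-moment correlation between two equally long score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def spearman_brown(r_half):
    """Step the half-test correlation up to the full-length test."""
    return 2 * r_half / (1 + r_half)


def split_half_reliability(first_half, second_half):
    """Estimate full-test reliability from the two half scores."""
    return spearman_brown(pearson_r(first_half, second_half))


# Hypothetical half scores for six pupils
first_half = [12, 15, 9, 18, 14, 11]
second_half = [11, 16, 10, 17, 12, 12]
print(round(split_half_reliability(first_half, second_half), 2))
```

Because the halves are each only half as long as the whole test, the correlation between them understates the reliability of the full test; the correction in `spearman_brown` compensates for this.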

In the Kuder-Richardson formula (5) the coefficient is estimated 
by determining the variance of the individual items, namely, multi- 
plying the percentage passing by the percentage failing each item 
and finding the sum for all items. This is divided by the variance of 
the total test, namely, the standard deviation of the total scores 
squared. The resulting ratio is subtracted from unity and adjusted 
for the number of items in the test. 
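The computation described in this paragraph corresponds to what is commonly called Kuder-Richardson formula 20. A minimal sketch is given below, on the assumptions that every item is scored 1 (passed) or 0 (failed) and that the variance of the total scores is computed with the number of examinees as divisor; the function name and the small data set are hypothetical.

```python
def kr20(responses):
    """Kuder-Richardson formula 20 for items scored 0 or 1.

    responses: one list of item scores per examinee.
    """
    n = len(responses)        # number of examinees
    k = len(responses[0])     # number of items
    # Proportion passing each item, and the sum of p * q over items
    p = [sum(person[i] for person in responses) / n for i in range(k)]
    sum_pq = sum(pi * (1 - pi) for pi in p)
    # Variance of the total scores (standard deviation squared)
    totals = [sum(person) for person in responses]
    mean_total = sum(totals) / n
    var_total = sum((t - mean_total) ** 2 for t in totals) / n
    # Subtract the ratio from unity and adjust for the number of items
    return (k / (k - 1)) * (1 - sum_pq / var_total)


# Hypothetical answers of four pupils to a three-item test
responses = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]
print(kr20(responses))
```

For these data the sum of the item variances is 0.625 and the variance of the total scores is 1.25, so the ratio is 0.5 and the adjusted coefficient is 0.75.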

In the split-half method, the resulting coefficient is a serious under- 
estimate if the halves of the test are not closely equivalent in difficulty 
and content. In the Kuder-Richardson formula, the resulting coeffi- 
cient is an underestimate if the test does not measure a homogeneous 
ability or characteristic. In both methods, the coefficient is spuriously
high if the test is speeded.


Coefficient of Equivalence The coefficient of equivalence, as in the 
case of the coefficient of internal consistency, may be defined as an index of
how consistently the test measures the individual's performance at a 
particular moment. It measures fluctuations from day-to-day in the 
individual and fluctuations in the sampling of content of the test or 
measure. The preferred method of determining this coefficient is to 
administer on two occasions to the same individuals two parallel 
forms of the test or measure, each form containing different ques- 
tions or content but which can reasonably be assumed or proved to 
be equivalent. The correlation between the scores on the two parallel 
forms is called the coefficient of equivalence because it measures the
relationship between equivalent forms of the test. 
A less precise method of obtaining the coefficient of equivalence is 
to use the split-half coefficient based on equated parts of a test,
provided the equated parts of the test are carefully documented for
equality in difficulty of the items and equality of content of the char- 
acteristic or abilities that are being assessed. 
To the extent that parallel forms of the test are not equivalent in 
difficulty and content, or split halves of the test are not equivalent in
difficulty and content, the correlation coefficient will not provide a
true estimate of the reliability of the test. Here again it is assumed
that proper precautions will be observed regarding the homogeneity
of the characteristic measured. 


Coefficient of Stability The coefficient of stability indicates the de- 
gree to which the scores on a particular test or measure are stable over 
a given period of time. It indicates whether a sample of behavior 
observed or assessed at one time is typical of behavior at subsequent 
times. This coefficient is estimated by the test and retest method. The 
same test is administered to the same individuals after an intervening 
period of time. A somewhat similar coefficient is obtained when two 
highly equivalent forms of the test are administered with an interven-
ing period of time. The correlation between the scores on the two
tests or measures is computed in order to obtain the coefficient.


Factors Related to Reliability Coefficients 


From the brief descriptions that have been provided and by infer- 
ence from the method of obtaining reliability coefficients, the follow- 
ing generalizations may be made: 


a. The reliability coefficient depends on the length of the 
test or instrument. 

b. The reliability coefficient, since it deals with the variance
of item performance as a ratio to variance on total test
performance, is closely related to the spread or range of
scores in the group studied.
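
Both generalizations can be illustrated numerically. The sketch below is not taken from the text: it uses the Spearman-Brown formula for the effect of length, and, for the effect of spread, a formula that holds only on the assumption that the error variance is the same in both groups. The function names and figures are hypothetical.

```python
def spearman_brown(r, length_factor):
    """Estimated reliability when a test is made length_factor times as long.

    Assumes the added items are comparable to the original ones.
    """
    return length_factor * r / (1.0 + (length_factor - 1.0) * r)


def reliability_for_spread(r_ref, sd_ref, sd_new):
    """Estimated reliability in a group with a different spread of scores.

    Assumption: the error variance is the same in both groups, so only
    the total variance changes.
    """
    error_variance = sd_ref ** 2 * (1.0 - r_ref)
    return 1.0 - error_variance / sd_new ** 2


# Doubling a test whose reliability is .60.
print(round(spearman_brown(0.60, 2), 2))               # 0.75

# A test with reliability .90 in a group with standard deviation 10,
# applied to a narrower group with standard deviation 5.
print(round(reliability_for_spread(0.90, 10, 5), 2))   # 0.6
```

Under these assumptions a longer test yields a higher coefficient, and the same test yields a lower coefficient in a group whose scores are bunched closely together.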


Application to Various Types of Evaluative Instruments 


Although reliability coefficients have been most widely used among 
the objective types of tests and measures, they may be applied to 
various other evaluative methods. In data obtained by direct obser- 
vational methods, for example, the results are often analyzed in terms 
of sampling error or reliability and observer error or reliability. 

The sampling error answers the question: Is the cross section of 
behavior obtained by observation typical or characteristic of the 
individual or group? This may be determined by splitting the obser- 
vations into halves and finding the relationship between the halves. 
Observer error or reliability answers the question: What is the con- 
sistency of observations among independent observers, or to what 
degree do the observers allow conscious or unconscious biases to 
influence their report? This may be determined by correlating the
results for two independent observers, who observed the same situa-
tions at the same times.

In rating methods experience has shown that the reliability of 
ratings may be increased by pooling the judgments of a number of 
persons. The number of judgments that should be pooled or averaged 
varies according to the degree of reliability sought and the nature 
of the trait rated. For many traits three or more independent ratings 
should be obtained and pooled. The reliability of ratings may be in- 
creased, also, by analyzing a trait into a number of subtraits and ask- 
ing each judge to rate the subdivisions. The reliability of the struc- 
tured interview may be studied by techniques similar to those used 
for rating methods. 
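
The number of ratings to be pooled can be estimated by turning the Spearman-Brown formula around. This is a sketch under assumptions not stated in the text: the judges are taken to be interchangeable, and the reliability of a single judge's rating is taken as known. The function name `ratings_needed` and the figures are hypothetical.

```python
from math import ceil


def ratings_needed(single_rating_r, target_r):
    """Smallest number of pooled ratings expected to reach target_r.

    Solves the Spearman-Brown formula for the number of judges,
    assuming every judge rates with the same reliability.
    """
    k = target_r * (1.0 - single_rating_r) / (single_rating_r * (1.0 - target_r))
    # Small tolerance so that a result such as 3.0000000001 is not rounded up to 4.
    return ceil(k - 1e-9)


# If one judge's ratings have a reliability of .50, how many judges
# must be pooled to reach .75?
print(ratings_needed(0.50, 0.75))  # 3
```

With a single-rating reliability of .50, three pooled ratings are expected to reach .75, while nine would be needed to reach .90.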

The reliability of the Rorschach and similar projective techniques 
has been examined by the test and retest method. In sociometric meth- 
ods, questionnaires, and inventories the test and retest method has 
also been used to determine reliability. 


OBJECTIVITY 

Objectivity is an attribute of a test or instrument so constructed 
that identical or very closely similar scores are assigned by different 
but equally competent scorers. In a highly objective instrument the 
scores assigned are not affected by the judgment, personal opinion, or 
bias of the scorers. 

In general, tests or instruments that possess the quality of objec- 
tivity are to be preferred to those in which the opinion of the scorer 
unduly influences the results. The administration of group tests of 
intelligence, achievement, and aptitude to large groups generally re- 
quires that the answers be scorable with high objectivity. On the 
other hand, flexibility with regard to objectivity is sometimes advan- 
tageous, especially in diagnostic and clinical work. Thus, the type of 
instrument, the purpose of the evaluation, and the technical compe- 
tence of the examiner require relative degrees of objectivity rather 
than a fixed standard. 


Techniques with High Objectivity 


Standardized group tests of intelligence, achievement, aptitude, 
attitudes, and interests have high objectivity because they are pro- 
vided with a scoring key which permits a competent scorer to deter- 
mine without exercise of much judgment or personal opinion the 
right or wrong answers to the items of the test. All of these instru-
ments are so designed in their structure that they permit the examinee
certain choices of standard answers and the answer selected has an
unequivocal standard value to contribute to the total score. 
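
The way a scoring key removes the scorer's judgment can be shown with a short sketch. The key and the pupil's responses below are hypothetical. Any scorer who applies the same key to the same answer sheet arrives at the same total.

```python
def score_with_key(responses, key):
    """Count the responses that match the scoring key, item by item."""
    return sum(1 for given, correct in zip(responses, key) if given == correct)


# Hypothetical five-item multiple-choice test.
key = ["b", "d", "a", "c", "a"]
pupil_responses = ["b", "d", "c", "c", "a"]

print(score_with_key(pupil_responses, key))  # 4
```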


Techniques with Moderate Objectivity 


Tests of intelligence such as the Binet and Wechsler-Bellevue, 
which are administered individually and orally by a clinically trained 
person, have moderate objectivity. Included in this category, also, 
are the projective techniques such as the Rorschach and the Thematic 
Apperception Test. In measures of this type the scoring and inter- 
pretation of the test or technique, while it has a high degree of stand- 
ardization and uniformity, permits the examiner to use his judgment 
in assessing certain of the values and making interpretations of the 
responses of the examinee. Closely associated with these are the 
performance tests in which the examinee manipulates apparatus and 
in which the examiner, by observation, makes judgments regarding 
the performance of the individual and translates the judgment into 
the score or rating to be assigned. The checklist depends upon obser- 
vation and judgment and therefore is classified as an instrument with 
moderate objectivity. Essentially the same factors of judgment oper- 
ate when teachers use rating scales of products such as a handwriting 
scale or drawing scale. Direct observation, with categories of be- 
havior to be observed defined in advance, also requires a moderate 
degree of judgment on the part of the observer in assessing the value 
or score to be assigned to the individual under observation. 


Techniques with Flexible Objectivity 


Flexible objectivity is especially desirable where the approach is 
clinical or pseudo-clinical. The major evaluation methods to be classi- 
fied under this category are the unstructured interview, the open-end 
questionnaire, the anecdotal record, a running account of behavior 
by direct observation, and interpretation of data from projective
techniques involving analysis of drawings, handwriting, and similar
stimuli. 


NORMS 


The results obtained from a test or other evaluative technique re-
quire interpretation. These results take on added meaning as they
are compared or contrasted with those obtained from different kinds 
of persons, or populations, to whom the tests or techniques have been 
administered. The raw score, or number of items correct, on a stand-
ardized test becomes much more meaningful when compared with
the average score obtained by reference groups arranged according
to age, grade, years of study, or type of person. 

A standardized test, for example, should have norms which will aid 
in the interpretation of the raw score obtained. For this reason, au- 
thors and publishers provide such norms as the following: 


a. Age norms for converting a raw score into a mental age, 
reading age, arithmetic age, etc. 

b. Grade norms for converting a raw score into the perform-
ance on the test of the average pupil at a given grade
level. 

c. Percentile norms for converting a raw score into a com- 
parison with the percentage of pupils of a given age or 
grade who obtained that raw score. 

d. Standard score norms for converting the raw score into a 
standard deviation from the mean, or average, for a given 
age, grade, or other reference group. 


These basic norms are defined and described in more detail in 
Chapter Four. Such norms are called derived scores and help in the 
interpretation of test results. Variations of these types of norms have 
been devised by different test authors and publishers. 
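
Two of these conversions, the percentile norm and the standard score, can be sketched as follows. The reference group is hypothetical, and the percentile rank is computed by one common convention (the percentage scoring below the raw score plus half of those scoring exactly at it). Publishers may define their norms differently.

```python
from statistics import mean, pstdev


def percentile_rank(raw, norm_scores):
    """Percentage of the reference group scoring below the raw score,
    counting half of those who obtained exactly that score."""
    below = sum(1 for s in norm_scores if s < raw)
    equal = sum(1 for s in norm_scores if s == raw)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


def standard_score(raw, norm_scores):
    """Distance of the raw score from the group mean, in standard deviations."""
    return (raw - mean(norm_scores)) / pstdev(norm_scores)


# Hypothetical raw scores of a small reference group.
norm_group = [2, 4, 4, 4, 5, 5, 7, 9]

print(percentile_rank(7, norm_group))  # 81.25
print(standard_score(9, norm_group))   # 2.0
```

In this group the mean is 5 and the standard deviation is 2, so a raw score of 9 lies two standard deviations above the mean.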

Among the recommendations of various committees, which are at-
tempting to establish standards for tests and diagnostic techniques,
the following are included:


a. Tables or scales used to report scores should be designed 
to permit easy and accurate interpretation by the test user. 

b. Norms should refer to clearly defined and described popu- 
lations, or reference groups, such as grade, age, curricu- 
lum, or occupational group. These populations should be 
the groups to whom users of the test will usually wish to 
compare the individual or group tested. 

c. Norms should be based upon a representative cross sec- 
tion, or sample, of the defined population or reference 
group. 

d. Although norms may be reported in terms of grade or 
age groups, it is desirable, also, to provide percentile 
equivalents or standard scores into which raw scores may 
be converted. 

e. For some uses of tests, local norms are more appropriate 
and important than national, regional, or other group
norms. In such cases, the test manual should suggest ap- 
propriate use of local norms. 




These criteria, or standards, may well be applied by the test user 
in the evaluation and selection of specific tests. Unless a test can 
meet these desirable standards in the reporting of norms, its use may 
be of doubtful value. 


Qualitative Norms 


As in the case of objectivity, so in the case of norms, flexibility of 
interpretation is desirable in diagnostic and clinical types of instru- 
ments. Under the categories of evaluative methods for which flexi- 
bility in norms and interpretations is desirable are the following: 


a. Sociometric techniques, in which the qualitative and de- 
scriptive relationships among individuals and cliques can 
be more adequately communicated by qualitative rather 
than quantitative terms. 

b. Interview, especially the unstructured interview, in which 
the purpose is to diagnose and to provide treatment for 
some behavior disorder or maladjustment. 

c. The open-end questionnaire, in which the various degrees 
of response are better expressed in qualitative and de- 
scriptive terms than in quantitative terms. 

d. Anecdotal records, in which the major purpose is to obtain
a descriptive and diagnostic picture of the individual's typ-
ical behavior in various situations.


PRACTICABILITY 


The qualities of practicability in a test or evaluative technique 
involve consideration of such factors as cost, ease of administration, 
ease of scoring, ease of interpretation, time requirements, and avail- 
ability of comparable forms of tests or techniques. In addition, the 
test or technique must be reasonably acceptable to the persons to 
whom it is administered and to the persons who use the results. This 
means that the test or technique must inspire a feeling of reality and 
purpose in the sense that some direct relationship may be observed 
between the exercises or content of the test or technique and the 
psychological characteristic that is to be appraised. Furthermore, the
typographical and physical appearance of the test or technique should
be as attractive and interesting as is feasible. 


Cost 


The cost of tests or techniques which measure the same psycho- 
logical characteristic may vary from publisher to publisher. When a 
school system is testing a large number of pupils, a difference in the
cost of a test or other evaluative technique may be an important
consideration. It is wise, therefore, for teachers and supervisors to 
examine tests and techniques for measuring various psychological 
characteristics and to obtain the test or technique of highest quality 
that may be obtained within the budget provided by the school system. 


Ease of Administration 


Another practical consideration is the ease of administering a test. 
Some tests require the services of expertly trained examiners. Unless 
such specially trained personnel are available, it is impossible to use 
the tests and to insure validity and reliability of the results. The use 
of any test or technique should be judged in terms of the related 
competencies of the personnel available for administering the test and 
the degree of expertness that is required for obtaining accurate results. 


Ease of Scoring 


Ease of scoring or analyzing primary data is an equally important 
consideration. Some techniques, such as the individual Binet exami- 
nation for intelligence and the Rorschach projective technique for 
personality appraisal, require expertly trained scorers. Some group 
tests use multiple and time-consuming scoring methods in order to 
arrive at differential scores for different psychological characteristics. 
A test or technique should be reviewed carefully to determine how 
practicable it may be in a specific school situation in terms of the 
degree of expertness, as well as the time, required for scoring. 


Ease of Interpretation 


Ease of interpretation of the results from various tests and tech- 
niques is another quality that must be judged. Some tests of person- 
ality, attitudes, aptitudes, and interests, for example, may require 
specially trained personnel in order that the results may be inter- 
preted validly. When it is proposed to use techniques in which the 
interpretation of results appears rather complex, the technique should 
be studied carefully to determine whether or not it is feasible to in- 
terpret the results with the competencies of school personnel avail- 
able for this purpose. 


Time 

Time requirements are frequently an important consideration. In 
many schools, as well as in business and industry, a limited time 
may be available to administer a test or technique. Because of the 
difficulties in rearranging class schedules, it may be necessary to use
a short test rather than a longer and more comprehensive one which
would give more valid and reliable results. On the other hand, it 
may be more feasible to arrange for the administration of several 
short tests or techniques that will give a more complete description of 
the individual than a single longer test or technique. In addition to 
these administrative factors, the length of a test or technique has an 
important effect upon the cooperation, interest, and effort of the indi- 
vidual who is examined. 


Comparable Forms 


Parallel or comparable forms of a test or technique are especially 
valuable when these are used for research purposes or for measuring 
the effects of teaching or of therapy. When such purposes are to be 
served, it is essential that parallel forms of the test be available for 
administration before and after a particular course or a particular 
period of therapy. A parallel form is frequently valuable to confirm 
a test score which may be inaccurate because of some interference,
either physical or emotional, when the first form of the test was
administered. 


Summary 


In this chapter, types, uses, and qualities of major evaluative tech- 
niques have been surveyed. The objective type tests, as one of the 
major techniques of evaluation, may be classified according to such 
psychological characteristics as intelligence, achievement, aptitude, 
attitude, interest, and personality. These tests may be classified, also, 
according to such technical features as individual versus group ad-
ministration, subjective versus objective scoring, standardized versus
informal norms, speed versus power, performance versus pencil-and- 
paper situations, and survey versus diagnostic purposes. Other major 
evaluation techniques which were introduced include: anecdotal rec- 
ords and observational techniques; oral and essay examinations; ques- 
tionnaires, inventories, and interviews; checklists and rating scales; 
personal reports and projective techniques; sociometric methods; case 
studies; and cumulative records. All of these techniques are explained 
and illustrated in more detail in subsequent chapters of this volume. 

Uses of evaluation data were discussed according to the functions 
they serve for such school personnel as the administrator, research 
worker, supervisor, teacher, and guidance counselor. The adminis-
trator, for instance, uses evaluation data for official records, reports
to parents, transfer and placement of pupils. The research worker
uses evaluation data to make analytical and comparative studies of 
pupil learning and adjustment. In direct instruction the teacher and 
supervisor use the data from evaluation techniques to adapt school 
activities and curricula to promote mental, physical, social, and emo- 
tional growth and development of pupils. In a like manner, the psy-
chologist or counselor uses results from evaluative techniques for
educational and vocational guidance. Along with the desirable uses 
of evaluation data, it is wise to guard against the misuse of such 
data. 

The qualities for judging an evaluative technique include validity, 
reliability, objectivity, norms, and practicability. Validity is that qual- 
ity which indicates the degree to which the evaluative instrument 
measures or diagnoses the psychological characteristics that it pur- 
ports to measure. Predictive validity is judged by the degree of the 
relationship between a measure and a subsequent measure or judg- 
ment of the performance or behavior, such as predicting aptitude for 
educational or occupational activities. Concurrent validity is judged 
by the relationship between a measure and an immediate criterion 
of behavior or performance. Content validity is judged by the rela- 
tionship between a measure or diagnostic technique and achievement 
in an activity or course of study which is the goal of training or guid- 
ance. Construct validity is established by the correspondence be- 
tween the results of a technique and other direct or indirect indicators 
of the characteristic or characteristics that are assumed to exist 
through empirical or theoretical deduction. 

Reliability is that quality which is judged by the degree of ac- 
curacy, consistency, or constancy of the measure obtained. Reliability 
may be expressed as the standard error of measure, or the actual 
amount of variation in scores which results when an individual is 
measured repeatedly with the same instrument. Reliability may be 
expressed, also, as a coefficient of reliability. Since the same psycho- 
logical test cannot be applied repeatedly to an individual without 
altering the psychological characteristics measured, the coefficient of 
reliability is employed for tests. Three major types of reliability coeffi- 
cients are: coefficient of internal consistency, coefficient of equiva-
lence, and coefficient of stability. The coefficient of internal con-
sistency estimates how accurately an instrument measures the indi- 
vidual's performance at a particular moment. It is obtained by divid- 
ing a test into equivalent halves and correlating the scores or by 
determining the variance of individual items and comparing the sum
of this variance with the variance of the total test score. The coeffi-
cient of equivalence is obtained from the correlation between scores
on two parallel forms of a test or measure administered on two occa- 
sions to the same individuals. The coefficient of stability is obtained 
from the correlation between scores on the same test or instrument 
administered to the same individuals at an earlier and later date, thus 
measuring the stability of the behavior assessed over a given period of 
time. 

Objectivity is that quality of a test or instrument which permits 
equally competent scorers to obtain identical or very closely similar 
scores on the same test or instrument administered to an individual. 
Techniques with high objectivity include standardized group tests of 
intelligence, achievement, aptitude, attitude, and interests. Techniques 
with moderate objectivity include individually administered tests of 
intelligence and projective techniques, such as the Rorschach and the 
Thematic Apperception Test, as well as performance tests, rating 
scales of products, checklists, and rating scales. Techniques with flex- 
ible objectivity include the unstructured interview, the open-end ques- 
tionnaire, anecdotal records, and running accounts of behavior ob- 
tained by direct observation. 

Norms are usually expressed as the average or typical score ob- 
tained on a test or instrument for individuals of a specified age or 
grade. Qualitative or descriptive norms are sometimes more appropri- 
ate and meaningful in summarizing results from such techniques as 
sociometric methods, interviews, open-end questionnaires, and anec- 
dotal records. 

The quality of practicability in a test or evaluative technique in- 
volves consideration of such factors as cost, ease of administration, of 
scoring, and of interpretation, as well as time requirements and avail- 
ability of comparable forms of tests or techniques. 


Problems for Class Discussion 


1. By interview and observation in a specific school, determine the uses 
made of intelligence and achievement test data by: (a) the school ad- 
ministrator, (b) the teacher, (c) the guidance or other special school 
personnel. 

2. Examine carefully the manual or other related publication explaining the 
validity, reliability, objectivity, and norms of two standardized tests in a 
subject you teach or are planning to teach. For each of these qualities, 
indicate briefly the kind of data provided by the author and publishers 
of the tests you have chosen. Give your judgment about the adequacy 
of such data as are reported. 




References Cited in This Chapter 


1. American Psychological Association, “Technical Recommendations for 
Psychological Tests and Diagnostic Techniques,” Supplement to the 
Psychological Bulletin, 51:1-88, March, 1954. 

2. Cronbach, Lee J., “Response Sets and Test Validity,” Educational and 
Psychological Measurement, 6:475-494, 1946.

3. Guttman, Louis, “A Basis for Analyzing Test-Retest Reliability,” Psycho-
metrika, 10:255-282, 1945.

4. Kelley, Truman L., Interpretation of Educational Measurements. Yon-
kers: World Book Company, 1927.

5. Kuder, George F., and Richardson, Marion W., "The Theory of Estima- 
tion of Test Reliability," Psychometrika, 2:151-160, 1937. 


References for Further Reading 


Darley, John G., "Statistics and the Understanding of Tests," Testing and 
Counseling in the High School Guidance Program. Chicago: Science Re-
search Associates, 1948, pp. 45-87.

In relatively nontechnical language and by simple illustrations, the 
computation and interpretation of such statistics as coefficient of reliability, 
percentile scores, and standard error of a measure, are presented. This 
reference discusses the meaning and application of statistical concepts 
related to tests and measures. 

Greene, Edward B., Measurements of Human Behavior. New York: Odyssey 
Press, 1941. 

In this volume the author provides a brief overview of the variety of 
tests and appraisal methods in Chapter 2. Characteristics of a satisfactory 
test are discussed in Chapter 5. In Chapters 13 and 14 uses of test results
for class diagnosis, individual diagnosis, and guidance are presented. 


Lindquist, E. F., Educational Measurement. Washington, D. C.: American 
Council on Education, 1951. 

This excellent volume provides a detailed statement of the uses of 
educational tests and measures in its opening chapters. Among the sub- 
sequent chapters the topics of reliability, validity, and norms are thor- 
oughly explored. The discussions on these technical characteristics are 
comprehensive and detailed. 


Administrative Aspects of an Evaluation Program


CHAPTER FOUR


Many obstacles stand in the way of realizing a com- 
prehensive program of evaluation in a school or a department of a 
school. These obstacles include lack of sufficient personnel, lack of suf- 
ficient materials, lack of clerical assistance, and lack of budgetary ap- 
propriations for the purchase of tests and measures. Within these limi- 
tations, however, it is possible to introduce and to continue the ad- 
ministration of a comprehensive evaluation program. 

Basically, the leadership of a principal or supervisor will influence 
widely the type of evaluation program in a local school. The principal
or supervisor must provide the leadership as well as manage effectively 
the personnel and materials that will be necessary to carry out the 
program that is planned. The administrator or supervisor must plan 
teachers’ meetings and in-service programs on evaluation. He should 
guide the committee work of teachers in the selection and administra- 
tion of standardized tests or in the construction of informal tests and 
should have available important reference books or handbooks on 
tests and related statistics. 

The classroom teacher plays a major part in an evaluation program. 
If he is to use the results for guiding the growth and development of 
individual pupils and the class, then the use and interpretation of 
evaluation results become the ultimate responsibility of the class- 
room teacher who deals directly with the pupil. The teacher will use 
the results of formal and informal evaluation to plan the curriculum 
or course so that instruction is based on the needs of the pupils. The 
teacher also will participate directly or indirectly in the construction, 
criticism, and revision of any informal tests that are developed. The 
teacher and administrator or supervisor must work as a team in under- 
taking an evaluation program. 



Introducing the Evaluation Program 


A comprehensive evaluation program should be initiated gradually 
so that the teachers will not be overburdened by too many new tests 
and techniques—all introduced at the same time. Even though it may 
seem desirable to evaluate a wide range of major objectives, it is wise 
to introduce the measurement techniques only as rapidly as they can 
readily be assimilated and used by the teachers. 

Several considerations will determine the rapidity with which an 
evaluation program can be introduced. The design of current
testing or evaluation practices and their acceptance by school personnel
are of major importance. Schools with more advanced practices can
proceed more rapidly than others. The background and the interests 
of teachers will vary from school to school, and these individual dif- 
ferences must be considered in the formulation of an evaluation pro- 
gram. In some schools teachers will have been introduced to more 
extensive evaluation programs than in other schools. The formulation 
of an evaluation program must provide opportunities for discussion 
and for in-service education of teachers on newer trends in evaluation 
and measurement, especially the application of results of measure- 
ment to the instructional program. This is a part of the responsibility 
of the principal or supervisor who is leading the program. 


LOCAL SCHOOL NEEDS 

The scope and nature of an evaluation program designed to meet 
local school needs will vary from school to school. The type of pupil 
personnel in the school, the type of curriculum or course offered and 
the adaptations of these to fit the pupil personnel are factors which 
must be considered. The major objectives to be emphasized—attitudes, 
thinking, information and skills, personal and social adjustment—must 
be determined. These major objectives should be determined by the 
teachers and supervisor in the school or in the department involved. 
In some schools or departments, the major objectives that are selected 
will involve mainly the measurement of skills, information, concepts, 
and knowledges. In others, the objectives may include not only skills, 
information and concepts, but also attitudes, interests, critical think- 
ing, and the personal and social adjustment of pupils. Each school or 
department must determine or select the major objectives which it 
wishes to emphasize as the goals of a curriculum or a course. 

The objectives selected for emphasis influence the tests and ap-
praisal techniques to be used. The first question is: What formal or
standardized tests can be used to measure the major objectives and
what will be their cost? The answer will involve a survey of published 
test materials and consultations with test experts in order that a wise
selection of published tests can be made. A second question is: What 
informal or teacher-made tests and techniques can be used? Stand- 
ardized tests are valuable for a periodic check, probably once each 
term or each year, but informal or teacher-made tests may be required 
to measure growth and progress over shorter periods of time, espe- 
cially for objectives for which published tests are not available. An- 
other function of informal or teacher-made tests is their use for in- 
structional purposes in order to locate strengths and weaknesses in 
the teaching and learning of specific units of work. The scope and 
nature of the informal tests must be determined by the teacher in 
consultation with the supervisor. 


ORGANIZING PERSONNEL AND MATERIALS 


The operation of an evaluation program in a local school calls for a 
wide measure of administrative efficiency. First, in an elementary 
school or in the departments of a secondary school, it is essential to 
delegate to selected teacher personnel various responsibilities in an 
evaluation program. Key personnel should be chosen from those who 
have interests and abilities in developing a more comprehensive eval- 
uation program. In larger schools, this may involve the selection of 
committees and chairmen of committees to put the program into op- 
eration. Another basic consideration is the organization of materials. 
Principals and supervisors can help provide suitable materials and 
equipment needed by the teaching staff for testing. These materials 
should be readily available and up-to-date, and would include: 

1. Books, articles, and pamphlets pertaining to the measure- 
ment of the objectives or courses. 

2. A file of selected standardized tests indexed by objectives. 

3. Equipment needed for construction of any informal test; for
example, stencils, mimeograph and hectograph materials. 

4. Provision for filing informal teacher-made tests systemat- 
ically so they are readily available when needed. 


Using Tests in an Evaluation Program 


The modern school committed to a program of evaluation will use 
many approaches to appraisal. However, the objective test will con- 
stitute a major aspect of the program the school will ultimately adopt. 




When a school decides to give a test, whether it is one which will 
be specially constructed by the school or one which will be purchased 
from a commercial publisher, the school is actually setting a four-fold 
task for itself: 


1. Constructing or selecting the test to be used. 

2. Administering the test. 

3. Scoring the test.

4. Interpreting the results obtained through testing.


The remainder of this chapter considers each of these four aspects 
of the administrative program. 


Constructing or Selecting Tests 


In many school situations, considerable attention is paid to the first 
of these four aspects of testing. The teacher or group of teachers 
charged with the responsibility of formulating a test for use in a 
given class, grade, or school subject will often spend a good deal of 
time in selecting the specific items which will be used. The analysis 
of the course of study and the identification of pupil knowledges, 
skills, and appreciations call for a considerable investment in teacher 
time and effort. (See Chapters Five and Six for a more complete dis- 
cussion of the process of test construction. ) 

When a standardized test is to be purchased, the group may eval- 
uate many possible choices before recommending that a specific test 
be selected. The various qualities characterizing an evaluative instrument,
which were discussed in the preceding chapter (validity, reliability,
objectivity, norms, and practicability), must be considered.
Members of the group might find it helpful to prepare a checklist of 
qualities desired in a standardized test, and utilize some rating device 
to quantify their subjective impressions. To this end, a scale such as 
that prepared by Rinsland (4) may be of value. The selection of a
standardized test must be given great care.

Too often, however, the construction or selection of the test to be 
used is looked upon as the completion of the task. The many impor- 
tant issues in the administration and scoring of tests are frequently 
afforded only casual attention by school personnel. Yet, if the time 
and expense involved in constructing or purchasing tests are not to 
be vitiated, test administration and scoring must be approached with 
as much care as that given to the writing of test items or to judging
the relative adequacy of standardized tests.


64 Nature and Scope of Evaluation 


Test Administration 


WHO SHOULD ADMINISTER TESTS? 


In general, tests should be administered to a class by the teacher of 
that class. It is rather surprising that no one questions the ability of 
the teacher to administer tests which are teacher-made, yet many au- 
thorities feel that standardized tests must be administered by specially 
trained examiners. It almost seems as if the teacher-made test is looked 
upon as a relatively unimportant measure of pupil achievement com- 
pared to the standardized test which has been constructed by experts. 
There is no reason, however, for relegating teacher-made measures to
a secondary position in the evaluative process, or for feeling that
teachers are generally incompetent test administrators. If the administration
of a standardized test calls for special skills which cannot
be mastered by average teachers, it should not be selected for use in 
the school's testing program. 

Before a standardized test is administered by classroom teachers, it 
would be wise to brief the group. An excellent approach is to have 
the group take the test themselves, using the directions specified by 
the test manual and shortened time intervals. In instances where test 
directions are complicated, or when the test is to be administered to
a very young group of pupils, a demonstration of test administration 
to an actual class should be scheduled, and a discussion of procedures 
be led by an experienced examiner. When such training practices are 
utilized, there is no reason why classroom teachers cannot administer 
standardized group tests and obtain valid test scores. 

There are definite advantages associated with having the teacher 
administer a test to his own group. The introduction of a strange ex- 
aminer places entirely too much emphasis upon the fact that the test 
is a departure from normal school routine, that it is particularly im- 
portant, and that there is something "special" about the whole busi- 
ness. The presence of the classroom teacher in the room during the 
test may be a reassuring factor, but it is difficult to eliminate com- 
pletely the added tension which accompanies the unfamiliar exam- 
iner. Moreover, the classroom teacher knows his pupils. He has 
learned, in the course of his work with the class, that Mary will need 
a smile or two of encouragement during the course of the test, that 
David will break at least two pencil points, and that Alan will need 
special help with the test directions. He can gauge pupil functioning 
in the test situation far more adequately than can a special examiner. 

This role of the teacher is important so that care given to selecting 
the test shall not be negated by unfortunate administration of it. 




WHAT PROCEDURE SHOULD BE FOLLOWED? 


As a first step, the teacher should study the directions for the ad- 
ministration of the test. This involves more than cursory reading of the 
directions, particularly when a standardized test is to be administered. 
Rather, each specific direction given in the test manual should be 
checked against the test booklet and, if necessary, appropriate notations
should be made on a specimen copy of the test booklet.

Prior to the day of testing, the teacher should assemble the ma- 
terials needed: test booklets, special answer sheets, scrap paper, an 
adequate supply of pencils or crayons, a copy of the test manual, a 
watch or clock, and a sign—“Testing—Do Not Enter.” The test should
be administered in a comfortable, familiar setting. The children's own 
classroom should be used unless that room is very noisy. If warranted, 
arrangements should be made for a shift in rooms prior to the date 
of testing. If any children in the group are not to be tested at the 
scheduled time, provision for their care during the testing period 
should also be made in advance. 

During the testing session, the following precautions should be ob- 
served: 


a. In the lower grades, take the children to the toilet im- 
mediately before the test. 

b. The prepared sign should be placed on the classroom door 
before testing begins. 

c. The children should be seated as far apart as possible to 
prevent copying. Special attention should be given to 
seating handicapped children. 

d. The teacher should make sure that each child has the right 
page before beginning work. He should move about the 
room during the course of the test in order to note 
whether directions are being followed without error. 

e. The children should be encouraged to keep working at 
maximum effort. This should be done by gestures rather 
than by words. 

f. Children who disrupt the testing procedure should be re- 
moved from the room. The teacher should note on the test
booklet or answer sheet observations of unusual reactions 
or any other information about the child that may serve to 
make the test results more meaningful. 

g. The teacher should make certain that all pupils stop work 
promptly when time is called. Test booklets or answer 
sheets should be collected immediately, and then other 
materials should be called for. 




the results be available at an early date, but the teacher will gain 
valuable information concerning pupil errors and gain a better under- 
standing of the meaning of the test scores. 

One should not assume, however, that because a teacher, rather 
than a clerk in a central office, is scoring the test, protection against
errors in scoring has been assured. Early in the history of the testing 
movement, studies (1, 3) indicated that teacher scoring is not error- 
less. Good procedure will not eliminate errors completely, but will 
reduce errors to a minimum. 


WHAT PROCEDURES SHOULD BE FOLLOWED? 


When a teacher scores the papers of the pupils in his own class, he 
should exert every effort to make the process as mechanical as possible.
The teacher should avoid noting to whom the test booklet belongs
and should concentrate only on the answers to the test items.
An impersonal approach decreases the likelihood of inaccurate scor- 
ing. The total task, too, should be divided into smaller units which 
can be handled efficiently and learned easily. It is generally inadvis- 
able, for example, to score the entire test booklet for one pupil be- 
fore going on to the next. Rather, it is preferable to score one or two 
subtests (or pages) for all pupils in the class at one time, and then 
go on to a second set of subtests. As the teacher becomes familiar with 
the scoring key, speed increases rapidly.

In school situations where the same test is administered to a num- 
ber of classes, it is advisable to have a group of teachers cooperate 
in test scoring. Each teacher, in such instances, would assume responsibility
for scoring a portion of the test booklet. All teachers
should use the same symbols for checking right and wrong answers. 

Provision must be made, of course, for making equitable assignments
and for providing some means of rotating the test booklets
from teacher to teacher. Ordinarily, this is done by placing the tests 
for each class in a separate envelope, and attaching a label in the 
form of a control sheet indicating the teacher's assignment and providing
space for keeping track of completed assignments. Group scoring
in this fashion is much more economical of teacher time and effort 
and has the additional advantage of eliminating the inequalities re- 
sulting from differences in class size. 

All scoring should be checked systematically. This does not mean
that every test booklet must be rescored, but that a random sample 
be checked. The World Book Company, publishers of many stand- 
ardized tests, suggests that the following procedure be used in check- 
ing the scoring of the Metropolitan Achievement Test Battery (2): 




“A recommended method of check scoring is that of selecting five 
papers at random and rescoring all tests in the battery for these five 
pupils. If no errors are found on any of the tests in the battery, it is 
not necessary to rescore any more papers. If one or more errors are 
found in a subtest, five more papers should be rescored for that par- 
ticular subtest. If more than two errors are found in these ten papers, 
then all the papers should be rescored for this subtest. If in the first 
five papers there were three or more errors, it is then necessary to 
score the whole package of tests for this particular subtest. 

"The ideal plan for check scoring is to have one person or, at the 
most, a small number of persons on the staff do all the check scoring. 
If a teacher must check his own work, he should go back to his papers 
after the scoring job has been completed and use the same sampling 
system." 
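The sampling rule quoted above can be restated as a small decision procedure. The following Python sketch is an illustration only and is not part of the publisher's manual: the function name and the returned labels are hypothetical, and it assumes that "errors" means the count of scoring errors found on a single subtest.

```python
def check_scoring_action(errors_in_first_five, errors_in_next_five=None):
    """Next step for one subtest under the quoted check-scoring rule.

    errors_in_first_five: scoring errors found on this subtest when the
        first five randomly chosen papers were rescored.
    errors_in_next_five: errors found in five further papers, or None if
        those papers have not been rescored yet.
    """
    if errors_in_first_five == 0:
        return "accept scoring"        # no errors: nothing more to rescore
    if errors_in_first_five >= 3:
        return "rescore all papers"    # three or more errors in the first five
    if errors_in_next_five is None:
        return "rescore five more"     # one or two errors: enlarge the sample
    total = errors_in_first_five + errors_in_next_five
    # More than two errors in the ten papers means the whole package is rescored.
    return "rescore all papers" if total > 2 else "accept scoring"
```

Under these assumptions, one error in the first five papers followed by one more in the next five leaves the scoring accepted, while two errors followed by one more calls for rescoring every paper on that subtest.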

Most standardized tests provide that scores on subtests be trans- 
ferred to the face sheet of the test booklet. The need for accuracy in 
this process is as great as that needed in scoring. Whenever data are 
transferred in this fashion, provision must be made for checking the 
accuracy of the record which is transferred. When groups of teachers 
participate in check scoring, the control sheet used in scoring should 
provide space for recording progress made in checking both original
scoring and recording of scores.


Interpreting Test Results 


Once the test has been administered and scored, the teacher is
faced with the problem of interpreting the obtained results. The raw
score which a pupil attains on a test has meaning only when some
method of comparing his performance with that of other pupils is
utilized. There are several methods of comparing scores, each of
which is of value for a particular purpose. These derived scores are
called norms.


NORMS 

A norm is usually the average or typical value of a particular psychological
characteristic measured in a specified homogeneous population.
For example, it may be the average reading comprehension
level of a representative sample of all 12-year-old children or all fifth-grade
children; it may be the median achievement on a particular
science test of a representative sample of all eleventh-grade pupils taking
the science course. A norm is a statement of present achievement
of the group and not a universal standard of accomplishment. In most




cases the average, either the mean or median, achievement of a group 
is taken as the norm, but sometimes other points such as percentiles 
or points on the standard deviation scale are used. Most norms on 
widely used standardized instruments are based upon the scores of 
a fairly large cross section of pupils who live in widely scattered parts 
of the nation. In addition to such national norms, however, local norms 
for particular states, cities, or regions are sometimes used. 


Grade Norm 


A grade norm may be defined as the mean or median achievement 
of pupils in a given school grade on a given standardized test. It may 
also be defined as the average status of pupils in a given grade in re- 
gard to a single factor such as spelling ability, reading comprehen- 
sion, or arithmetic ability. Grade norms are ordinarily based upon the 
assumption that a school system contains twelve grades. Suppose the 
average score obtained by sixth-grade pupils tested at the end of 
January on a reading test is 35 items correct. This is translated into 
a grade score of 6.5, which represents achievement of the average 
pupil in sixth grade, fifth month. Directness of interpretation and ap- 
parent simplicity have caused grade norms to be used very widely 
at the elementary-school level. In spite of their many limitations, they 
probably represent the best available method for making scores com- 
parable for elementary-school achievement tests. They do have direct 
meaning which is a highly desirable feature in any system of derived 
scores. Many teachers, however, confuse a grade norm with a stand- 
ard of work and consider that a given class is doing satisfactory work 
if the class is up to the norm, regardless of other relevant factors such 
as general intelligence level of the pupils, community background 
variables, curriculum deviations, and similar factors. 
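The arithmetic behind a grade score such as 6.5 can be sketched briefly. The Python fragment below is illustrative only: it assumes a 10-month school year running from September through June, and the function name is hypothetical.

```python
# Assumed 10-month school calendar; actual calendars differ between systems.
SCHOOL_MONTHS = ["September", "October", "November", "December", "January",
                 "February", "March", "April", "May", "June"]

def grade_placement(grade, last_month_completed):
    """Grade placement in grade.month form, e.g. 6.5 = sixth grade, fifth month."""
    months_completed = SCHOOL_MONTHS.index(last_month_completed) + 1
    return round(grade + months_completed / 10, 1)

# Sixth-grade pupils tested at the end of January:
print(grade_placement(6, "January"))  # 6.5
```

A raw score of 35, being the average for pupils at this placement, would then be entered in the table of norms opposite a grade score of 6.5.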


Age Norm 


An age norm is a statement of the mean or median performance 
on an intelligence or achievement test by a group of pupils of a desig- 
nated chronological age. Suppose pupils who are 11 years 8 months 
old have an average of 35 items correct on a reading test. The age 
score or norm for 35 items correct would then be 11-8. By disregarding
grade placement, age norms make the tacit assumption that increase
in chronological age is the important consideration, not the
grade in which recent instruction was received. This factor becomes 
less important as promotion is more and more on the basis of chrono- 
logical age, provided instruction is well suited to the needs of the 




individual child. Age norms prorate growth over a 12-month period, an
assumption contrary to that made in setting up grade norms, which
prorate growth over a 10-month school year and thereby assume that
little or no growth takes place during the summer period. Age norms
are especially useful in the standardization of intelligence, or mental
ability, tests.
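The years-and-months notation used for age scores can be illustrated with a short sketch. The Python below is offered only as an illustration under stated assumptions: the function names are hypothetical, and ages are taken in completed months.

```python
def age_score(age_in_months):
    """Write an age in the years-months form used for age norms, e.g. '11-8'."""
    years, months = divmod(age_in_months, 12)
    return f"{years}-{months}"

def months_from_age_score(score):
    """Convert an age score such as '11-8' back to a number of months."""
    years, months = score.split("-")
    return int(years) * 12 + int(months)

print(age_score(11 * 12 + 8))         # 11-8
print(months_from_age_score("11-8"))  # 140
```

Because the year is divided into twelve steps here, one month on an age scale stands for one-twelfth of a year's growth, whereas one month on a grade scale stands for one-tenth.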


Percentile Norm 


A percentile norm may be defined as a point on a scale of measure- 
ment determined by the percentage of individuals in a given popu- 
lation that lies below this point. Percentile norms may be defined also 
as corresponding to the points which divide the total number of cases 
contained in a frequency distribution of a normative group into 100 
equal parts. Each of the 100 parts is assumed to contain the same 
number of cases. Although the more common method of reporting 
norms is in terms of the median, which is the same as the 50th per- 
centile, this is frequently supplemented by a statement of other per- 
centile points in the frequency distribution for a given age or for a 
given grade. A tenth-grade pupil, for example, may obtain a raw 
score of 37 on a reading test. This raw score is equivalent to a per- 
centile score of 63. This score means that this pupil achieves a score 
above 63 per cent of tenth-grade pupils. Percentile norms are widely 
used on readiness tests for first-grade children, on achievement tests 
in various subjects for high-school children, on interest inventories,
personality inventories, and rating scales.
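The first definition given above, the percentage of the normative group lying below a given point, can be computed directly. The following Python sketch is illustrative: the function name and the normative group are hypothetical, the latter constructed only so that it reproduces the example of a raw score of 37.

```python
def percentile_rank(raw_score, norm_group_scores):
    """Per cent of the normative group scoring below raw_score."""
    below = sum(1 for s in norm_group_scores if s < raw_score)
    return 100 * below / len(norm_group_scores)

# Hypothetical normative group of 100 tenth-grade pupils:
# 63 scored below 37 and 37 scored 37 or higher.
norm_group = [30] * 63 + [40] * 37
print(percentile_rank(37, norm_group))  # 63.0
```

Other conventions, such as counting half of the pupils who obtain exactly the given score, are also in use, so a published table of norms should be consulted for the rule actually followed.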


Standard Score 

A standard score is expressed as a deviation of a score from the 
arithmetic average of the normative group in which the standard devi- 
ation of the normative group is used as the unit of measurement. Thus, 
if the average raw score of the normative group is 50 and the raw score 
standard deviation is 10, a raw score of 60 would be exactly 1 standard 
deviation unit above the average, and the standard score would be 1.0.
The basic formula for determining the standard score is: Score minus 
mean score divided by standard deviation of the normative group. In 
standard scores all differences between individuals retain their same 
relative values. In the basic formula standard scores use a pair of 
constants which result in a mean of zero and a standard deviation of 
one for the group used as a standard because the mean raw score is 
subtracted from each score and then each difference is divided by the 
raw score standard deviation. In order to avoid the use of decimals 




and negative signs, standard scores are frequently converted to various
relative scales. Some variations of the standard score technique
use mean values such as 50, 100, or 500 instead of zero, with standard 
deviations of 10, 20, and 100 respectively instead of 1. Such scores 
simplify interpretation and increase comparability. Standard scores 
are generally more difficult for the average classroom teacher to un- 
derstand than the other derived scores. For this reason the standard 
score has tended to be used most frequently among psychologists 
and research workers. 
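The basic formula and its converted forms can be worked through with the figures used above. The Python sketch below is illustrative only; the function names are hypothetical, while the means and standard deviations are those mentioned in the text.

```python
def standard_score(raw, mean, sd):
    """Score minus mean score, divided by the standard deviation."""
    return (raw - mean) / sd

def converted_score(z, scale_mean, scale_sd):
    """Re-express a standard score on a scale with a chosen mean and SD."""
    return scale_mean + scale_sd * z

z = standard_score(60, mean=50, sd=10)
print(z)                             # 1.0
print(converted_score(z, 50, 10))    # 60.0
print(converted_score(z, 100, 20))   # 120.0
print(converted_score(z, 500, 100))  # 600.0
```

A pupil one and one-half standard deviations below the average would have a standard score of -1.5 and a converted score of 35 on the scale with a mean of 50, which shows how the conversion removes both the decimal and the negative sign.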


ANALYSIS OF ERRORS 


The determination of an individual pupil's standing in a group is 
only one of the ways in which a teacher uses the results obtained on 
a test. He is also concerned with determining which of the areas cov- 
ered by the test presented the most difficulty to the pupil. A tabula- 
tion of incorrect responses serves to develop this information. 

One way in which such a tabulation may be made is illustrated in 
the figure on page 73 (2). In this illustration, check marks represent
errors on given items, and zeros represent omissions. Totals at the 
bottom of the page give the errors and omissions on each item, while 
totals at the right, the errors and omissions of each pupil. 

An analysis of this type is of great value to the teacher in determin- 
ing individual and group needs for additional help. It indicates where 
additional stress must be placed for greater understanding and where 
a directed program of remedial teaching is needed. While it is true 
that the tabulation suggested involves considerable work on the part
of the teacher, the values gained generally justify the investment of
time required.
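A tabulation of this kind amounts to counting marks along the rows and columns of a pupil-by-item table. The Python sketch below is one possible arrangement and is not taken from the manual cited: the function name, the marking symbols, and the three-pupil class are hypothetical.

```python
def tabulate_errors(marks_by_pupil):
    """Count errors and omissions by item and by pupil.

    marks_by_pupil maps each pupil's name to a list with one entry per
    item: "" for a correct answer, "x" for an error, "0" for an omission.
    """
    n_items = len(next(iter(marks_by_pupil.values())))
    # Column totals: errors and omissions on each item.
    item_totals = [
        sum(1 for marks in marks_by_pupil.values() if marks[i] != "")
        for i in range(n_items)
    ]
    # Row totals: errors and omissions of each pupil.
    pupil_totals = {
        pupil: sum(1 for m in marks if m != "")
        for pupil, marks in marks_by_pupil.items()
    }
    return item_totals, pupil_totals

# Hypothetical class of three pupils on a four-item test.
marks = {
    "Mary":  ["", "x", "0", ""],
    "David": ["x", "x", "", ""],
    "Alan":  ["", "x", "", "0"],
}
item_totals, pupil_totals = tabulate_errors(marks)
print(item_totals)   # [1, 3, 1, 1]
print(pupil_totals)  # {'Mary': 2, 'David': 2, 'Alan': 2}
```

In this invented class the second item was missed by every pupil, which is the kind of result that would point to a need for reteaching rather than for individual remedial work.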


Summary 


Only as constructive leadership is provided will a modern program 
of evaluation be realized. The evaluation program should be formu- 
lated jointly by the supervisors and teachers to meet local school 
needs. The evaluation program should be introduced gradually so that
the teachers will not be overburdened by too many new tests and 
techniques, all introduced at the same time. An important aspect in 
the administration of an evaluation program is the organization of 
personnel and materials. Competent and experienced personnel should 
be selected for committees to put the program into operation. The ad- 
ministrator should provide suitable materials and equipment, such as 
books, files of tests, mimeograph equipment, and provisions for filing
teacher-made tests. The role of the supervisor must be creative and
democratic to gain rapport with the entire staff. The role of the classroom
teachers, however, is central in using the results for guiding the
growth and development of individual pupils.

[Figure 1. Analysis of Errors. Tabulation of errors and omissions, with pupils' names as rows and item numbers as columns. From Hildreth, G. H., Manual for Interpreting Metropolitan Achievement Tests (2).]

There are four major administrative aspects of a program of testing: 
(a) constructing or selecting the test to be used, (b) administering 
the test, (c) scoring the test, and (d) interpreting the results. Test 
construction involves an analysis of the course of study used and the 
identification of the pupil knowledges, skills, and appreciations which 
are the objectives of the instructional program. The selection of a 
suitable test calls for a consideration of such factors as validity, reliability,
objectivity, norms, and practicability.

Tests should generally be administered by the classroom teacher, 
who should have received some training prior to the administration 
of the test. Care should be taken to emphasize the need for following 
test directions exactly as given in the test manual and for accurate 
timing. Insofar as possible, test scoring should be routinized. A sys- 
tematic method of check scoring should be developed. 

The interpretation of the results of testing ordinarily involves com- 
paring the performance of a given pupil with that of the group of 
which he is a member. Norms—grade norms, age norms, percentile 
norms, or standard scores—are ordinarily used for this purpose. A
norm is usually the average or typical value of a particular psycholog- 
ical characteristic measured in a specified homogeneous population. 


An analysis of pupil errors is also of value in interpreting group per- 
formance. 


Problems for Class Discussion 


1. Assume that you are the principal of an elementary school which has
four sixth-grade classes. You decide to administer an intelligence test 
and a test of achievement in reading in these classes. 


a. Prepare scales for rating the adequacy of an intelligence test and 
a test of reading ability; use as criteria such factors as validity,
reliability, objectivity, norms, and ease of administration, scoring,
and interpretation.

b. Examine three intelligence tests and three reading tests suitable for 
use on the sixth-grade level and use your rating scales to determine 
which of the tests in each area you would use. 

c. Assuming that only one of the sixth-grade teachers in your school is 
experienced as an examiner, plan a two-day schedule for administer- 
ing the two tests to the four classes on the grade. 


d. Plan a program for scoring and check scoring the two tests. 




References Cited in This Chapter 


1. Dearborn, W. F., and Smith, C. W., "The Results of Rescoring Five-Hun- 
dred-Thirty Dearborn Tests," Journal of Educational Psychology, 20: 
177-188, March, 1929. 

2. Hildreth, G. H., Manual for Interpreting Metropolitan Achievement 
Tests. Yonkers: World Book Company, 1948. 

3. Madsen, I. N., “Participation in Testing Programs by the Classroom
Teacher,” Educational Administration and Supervision, 15:117-126, 
February, 1929. 

4. Rinsland, H. D., “A Form for Briefing and Evaluating Standardized 
Tests,” Journal of Educational Research, 42:371-375, January, 1949.


References for Further Reading 


Lindquist, E. F., editor, Educational Measurement. Washington: American
Council on Education, 1951.
Chapter 10 deals with the administration and scoring of objective tests,
presenting a comprehensive treatment of both topics.

Orleans, J. S., Measurement in Education. New York: Thomas Nelson and
Sons, 1937.
Chapters 6, 7, and 8 present a brief treatment of test administration and
scoring, and go considerably beyond the scope of the present chapter in
discussing test interpretation in a typical school situation.


PART TWO

Major Evaluation Techniques


CHAPTER FIVE

Short-Answer Tests


Written tests used by teachers for measuring pupil 
achievement may be classified as essay examinations or short-answer 
examinations. The latter are often referred to as objective, or “new- 
type,” tests. The essay examination, which generally asks the pupil 
to discuss, compare, give reasons, and the like, requires the formula- 
tion of an extended verbal answer to the question. Short-answer tests, 
on the other hand, consist of questions to which the pupil responds 
by the selection of one or more of several given alternatives, by giv- 
ing or filling in a word or a phrase, or by some other device which 
does not call for an extensive written response. 

In recent years, the essay examination, once the mainstay of the 
classroom teacher's approach to measurement of achievement, has
been replaced to a considerable extent by the short-answer test. Lee 
and Segel (5), in a comprehensive study of classroom testing prac- 
tices, found that only 16 per cent of high-school teachers use essay 
examinations extensively. In view of their widespread popularity, the 
present chapter considers the more common types of short-answer tests 
—completion, multiple-choice, true-false, and matching exercises. 


Values and Limitations of Short-Answer Tests


VALUES 
Short-answer tests, in comparison with essay examinations, possess 
certain definite advantages. 


Sampling In an essay examination, the process of writing a response
is extremely time-consuming. Use of a short-answer test, where the
response is quickly given, makes it possible for the pupil to answer




many more questions in the same amount of time. As a result, short- 
answer tests generally show much better coverage of total course con- 
tent than do essay examinations. Even spotty preparation may result 
in high grades when a pupil happens to study just that material called 
for by a few essay questions. Such chance results operate to a far less 
degree when short-answer tests are employed. In the latter situation, 
the teacher may be more certain that the grade the pupil earns is a
true measure of his achievement. 


Reliability of Scoring One of the most likely sources of inaccuracy 
in assigning grades to essay examinations is the lack of objectivity in 
grading pupil answers. Sources of the lack of reliability in scoring 
essay questions are discussed in Chapter Six. In view of the fact that 
questions on short-answer tests generally have only one acceptable 
response, objectivity of scoring is relatively high. Clerical errors, 
rather than errors of judgment, reduce reliability of scoring. 


Ease of Scoring Since the short-answer test may be scored by use of 
a key listing correct responses, little technical skill is involved. In the 
usual classroom situation, scoring of essay tests calls for a large ex- 
penditure of teacher time, much of which can be saved by employing 
the short-answer tests. Identification of pupil weaknesses is also an 
easier task when short-answer tests are used. By counting the number 
of errors made on each question or item of the short-answer test, the 


teacher can readily ascertain the particular elements of course content
which show inadequate mastery, and arrange for reteaching. A similar
analysis of responses to essay questions is an exceedingly laborious
task.


Instructional Uses The relatively better sampling and identification
of pupil errors make it possible to use a short-answer test as a pretest
prior to embarking on a new unit of work. By analyzing pupil
errors on a short-answer test given before the new unit is introduced,
the teacher can determine the degree to which the new work represents
material already mastered, or material completely foreign to the
class. On the other hand, the instructional values of short-answer tests 
when used at the end of a unit of work should not be overlooked. 
The studies of Curtis and Woods (1) and of Plowman and Stroud 
(6) indicate that returning corrected papers to pupils or having them 
correct their own papers prior to a discussion of errors results in bet- 
ter ultimate achievement. 




LIMITATIONS


The limitations of short-answer tests grow out of the nature of the 
responses which are required of the pupil and other factors which are
discussed in the following paragraphs.


Guessing In short-answer tests where the pupil is called upon to 
select one of a number of possible alternative answers, a series of for- 
tunate guesses will markedly increase the pupil's score. In a test call- 
ing for the recognition of the truth or falsity of 100 statements, for 
example, a group of pupils would answer correctly 50 times, on the
average, simply by following the dictates of a tossed coin. Statisticians 
have proposed several formulae for reducing the effect of guessing. 
In the ordinary course of classroom teaching, however, there is little 
to be gained through the application of any of the proposed correc- 
tions. When pupils respond to every item of a test (the usual class- 
room practice), it can be shown that the relative rank of pupils will 
be the same whether their scores are computed by simply counting 
their correct answers or by the use of any of the common scoring 
formulae which correct for guessing (3). 
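The point about relative rank can be checked with a little arithmetic. The Python sketch below assumes the widely used correction, rights minus wrongs divided by one less than the number of choices; the text does not say which formulae it has in mind, and the pupils and scores shown are hypothetical.

```python
def corrected_score(right, wrong, choices):
    """Rights minus wrongs/(choices - 1); omitted items are not counted."""
    return right - wrong / (choices - 1)

N_ITEMS = 100   # true-false test, so two choices per item
rights = {"Pupil A": 90, "Pupil B": 75, "Pupil C": 60}  # hypothetical scores

# Every item is answered, so wrongs = N_ITEMS - rights.
corrected = {p: corrected_score(r, N_ITEMS - r, 2) for p, r in rights.items()}

raw_order = sorted(rights, key=rights.get, reverse=True)
corrected_order = sorted(corrected, key=corrected.get, reverse=True)

print(corrected)                     # {'Pupil A': 80.0, 'Pupil B': 50.0, 'Pupil C': 20.0}
print(raw_order == corrected_order)  # True
print(corrected_score(50, 50, 2))    # 0.0
```

When no items are omitted, the corrected score is a straight-line function of the number right, which is why the order of pupils cannot change. The last line shows that the 50 correct answers expected from coin tossing reduce to a corrected score of zero.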


Difficulty in Construction The preparation of good short-answer 
tests generally requires considerably more time than the development 
of an essay examination. Not only is it necessary to prepare a “blue- 
print” for the short-answer test as a whole, but the selection of the 
most appropriate short-answer form and the application of the form 
to a given question usually calls for more resourcefulness than the 
formulation of a few simple essay questions. Fortunately, a teacher 
can maintain a file of short-answer questions which have been utilized 
in the past. In this way, he may not only reduce the time involved in 
test construction, but create a reservoir of test items for use in special
situations.


Cost of Administration The large number of items which are in- 
cluded on a short-answer test generally makes it necessary to employ 
some mechanical means for reproducing enough copies of the test for 
class use. In most schools, of course, equipment for duplicating test 
copies will be available. When reproduction is not possible, the 
teacher may find it necessary to resort to oral presentation of the test. 
Several studies (4, 9, 10) have indicated that oral presentation does 
not significantly decrease the validity or reliability of short-answer 
tests. 




Testing Complex Processes In many classroom situations, the teacher 
is concerned not only with the correctness of the pupil's answer to a 
question, but with the correctness of the thought process involved in 
arriving at a correct answer. Some progress has been made in devel- 
oping short-answer tests which attempt to reveal thought processes. 
In one approach, for example, the pupil is asked to justify a given 
conclusion by selecting appropriate statements from a list which is 
presented to him. Even in this kind of question, however, space limi- 
tations restrict the number of possible ways of justifying the conclu- 
sion which may be included in the list of alternatives. In teacher- 
made tests, the essay type generally remains the better instrument for 
testing complex processes. 


Types of Short-Answer Test Items 


Completion Items The completion item requires the pupil to com- 
plete the thought of a sentence by filling in the word or words that 
have been omitted, or it directs him to respond to a question by writ- 
ing his answer in the blank space provided. Because the pupil needs 
to decide upon his answer and then write it out, a test composed of 
completion items takes longer to administer than do other forms of 
objective tests. If the test is timed, moreover, the pupil who writes 
slowly is handicapped. The scoring of the answers is not as fully
objective as that for other types of items, and alternative correct
responses need to be included in the scoring key. The scoring of the
item:


The chief law enforcement officer of the United States
is ________.


is not as objective as the scoring of these items: 


The Chief Justice of the U. S. Supreme Court is ________.
Who is the Governor of California? 


The completion item, however, offers a natural form of questioning. 
It can be used readily with material calling for specific information. 
It is also useful with questions requiring considerable reasoning or
organization on the part of the pupil, provided the answer can be
expressed in a few words.

A somewhat more complicated type of completion item is com- 
monly referred to as the connected-discourse type. Test items put in 
this form have the advantage of providing the respondent with addi- 
tional cues to guide the response. 




Example: 


________ should be used as a thinner for paint, while
________ should be used as a thinner for shellac.


Several cautions must be observed in constructing completion items. 


1. The blank must call for a simple specific response. For
example, a poorly worded item would be: 
Wundt established the first laboratory for the experi- 
mental study of psychology in ————. 
This may be correctly answered by writing a date, a city, 
or a country in place of the blank. 
The following item is still more vague in intent: 
On a school fire drill, the teacher must make certain 
that her pupils observe ________, ________, and ________.
2. Statements should never be taken directly from textbooks. 
Not only does dependence upon the textbook lead to rote 
memorization on the part of the pupil, but the test con- 
structor will find that verbatim statements tend to be am- 
biguous, unduly wordy, and to contain too many cues. 
3. In the lower grades, use direct questions in preference to
incomplete declarative sentences, thus:
Poor: ________ is the capital of New York State.
Better: What is the capital of New York State? 


True-False Items The true-false item requires the pupil to express 
his judgment of a given statement by indicating True or False, Yes 
or No, or some similar response. It is adapted to the testing of simple 
facts, ideas, and concepts. Scoring of tests composed of true-false 
items is easy and objective. Such tests seem simple to construct, but 
this apparent advantage is not a real one. In practice, considerable 
care is needed in framing the statements so that the ability to be
measured is actually revealed.


Example: 
Litmus paper turns blue in an acid solution ... True False 


True-false questions have been the source of greater irritation to
pupils than almost any other type of short-answer question. This
irritation, too, is generally expressed by the superior student who may
see one or more exceptions to a statement which would be acceptable to
a somewhat less-informed person. The primary difficulty, particularly
when dealing with concepts, is that of formulating a statement which
is completely true, that is, true without exception. For pinpoint
information, there are fewer difficulties. However, even in the
statement, “Edison invented the phonograph,” some students may be
bothered about “phonograph” because the term came into common use
after Edison's invention.


In constructing true-false items, care should be taken to avoid several
common pitfalls:


1. Avoid broad generalizations. The attempt to develop generalizations
in true-false items leads to the use of specific determiners, words
which are usually associated with either a true or a false statement.
Thus, the words “all,” “only,” “none,” “never,” and “always” are
generally found in false, rather than true, items constructed by
teachers; words such as “generally,” “as a rule,” “most,” “some,” and
“often” are found much more frequently in true statements. Careful
review of a series of true-false items to make certain that such
determiners, if used at all, are evenly balanced between true and
false statements, is a necessary step in constructing a true-false
test.


2. Avoid testing minutiae. The statement:

T F The Henmon-Nelson Test of Mental Ability contains
90 items arranged in spiral order.

calls for information which the teacher should not expect
the pupil to have at his finger tips. Whether the test contains
90 items or 75 items is inconsequential.


3. Avoid double statements. Each true-false item should test
a single concept. This item is confusing to students:

T F The Wechsler Adult Intelligence Scale is designed
to replace the Wechsler-Bellevue Scale, the most
widely used individual intelligence test.

It would be preferable to cover this material in two items,
thus:

T F The Wechsler Adult Intelligence Scale is designed
to replace the Wechsler-Bellevue Scale.

T F The Wechsler-Bellevue Scale is the most widely
used individual intelligence test.


4. Avoid long, complicated statements. Pity the poor student
who is asked to unravel a statement like this:

T F Beard, in his interpretation of the United States
Constitution, advanced a thesis of economic determinism,
pointing a direct relationship between holders of the
government debt and their advocacy of a strong government
to pay it off, although he later indicated that not only
economic factors, but the interplay of political, cultural,
and international forces is also important for the
interpretation of historical events.

This item is a measure of reading comprehension. Moreover,
the test-wise student would respond “True,” since he has
learned that most long, involved statements are true. Teachers
find it easier to write brief and concise false statements.
Therefore, an attempt should be made to include approximately
the same number of words in both true and false statements.
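
Two of the checks described above, the balance of specific determiners and the balance of statement length, lend themselves to a simple tally. The sketch below is one possible way of making that review, not a standard procedure. The word lists are those named in the first pitfall; the function names and the four draft items are hypothetical.

```python
import re

# Determiners the text associates with false and with true statements.
FALSE_CUES = {"all", "only", "none", "never", "always"}
TRUE_CUES = {"generally", "as a rule", "most", "some", "often"}

def cue_count(statement: str, cues: set) -> int:
    """Count cue words or phrases in a statement, ignoring case."""
    text = statement.lower()
    return sum(len(re.findall(r"\b" + re.escape(cue) + r"\b", text)) for cue in cues)

def review_items(items: list) -> dict:
    """Summarize determiner use and mean word count for true and false items.

    `items` is a list of (statement, key) pairs, where key is True or False.
    """
    summary = {}
    for key in (True, False):
        group = [s for s, k in items if k == key]
        summary[key] = {
            "items": len(group),
            "false_cues": sum(cue_count(s, FALSE_CUES) for s in group),
            "true_cues": sum(cue_count(s, TRUE_CUES) for s in group),
            "mean_words": sum(len(s.split()) for s in group) / len(group) if group else 0.0,
        }
    return summary

# Hypothetical draft items, written only to exercise the check.
draft = [
    ("Litmus paper turns red in an acid solution.", True),
    ("Most metals generally expand when heated.", True),
    ("All acids are dangerous to touch.", False),
    ("Water never boils below 212 degrees Fahrenheit.", False),
]
report = review_items(draft)
```

In this draft every determiner falls where a test-wise pupil would expect it, which is exactly the imbalance the review is meant to catch. A marked difference in mean word count between the two groups would call for similar revision.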


Multiple-Choice Items The multiple-choice item requires the pupil 
to recognize which of several suggested responses is the best or the 
correct way to answer a question or complete a statement. While the 
completion item requires the pupil to produce the correct response 
without suggestion, the multiple-choice item calls for recognition only. 
It is adapted to the testing of complex ideas and interpretations. The 
scoring of the multiple-choice item tends to be more objective and 
simpler than that of the completion item. The multiple-choice item 
is superior to the true-false item, which presents only two alterna- 
tives, in that it reduces the opportunity for guessing the correct an- 
swer. The multiple-choice type of question is also relatively free from 
“absolutes” in that the “best” statement of several that are given is 
to be selected as the "correct" answer. The "correct" answer, therefore, 
is relative to several other given statements rather than to all possible 
“not given" statements, as in true-false questions. 

Multiple-choice questions are found in several patterns. Probably
the most common pattern is the use of a stem, which sets the question,
followed by several alternative statements, one of which is assumed
to be the best answer.


Examples: 
Of the following, which is the most widely used form of 
city government in the United States? 
(1) the city-manager type 
(2) the mayor-council type 
(3) the commission type
(4) the council-manager type 




In which of the following fields has the cooperative move- 
ment in the United States been most successful? 

(1) Manufacturing and credit
(2) Marketing and credit
(3) Consumption and marketing
(4) Credit and consumption


Incorrect or unacceptable alternatives in multiple-choice items are 
known as distractors or foils. The formulation of such alternatives re- 
quires considerable care. Distractors should represent misconceptions 
and common errors which actually do arise in the students’ thinking. 
Distractors which are implausible are not likely to be chosen, even 
by a poor student, and therefore will not contribute to the measure- 
ment of his achievement. Distractors, if properly developed, can serve 
as important a function in the question as the correct answer, in that 
they serve as the starting point for diagnosis of individual difficulties. 
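
Whether a distractor is in fact being chosen can be judged by tallying pupils' responses to an item after the test has been given. The sketch below is a minimal illustration of such a tally. The function names, the sample responses, and the 5 per cent cutoff are assumptions made for the example, not recommendations from the text.

```python
from collections import Counter

def option_tally(responses: list, options: list) -> dict:
    """Count how many pupils chose each option of one multiple-choice item."""
    counts = Counter(responses)
    return {opt: counts.get(opt, 0) for opt in options}

def unused_distractors(tally: dict, correct: str, min_share: float = 0.05) -> list:
    """Distractors chosen by fewer than `min_share` of the pupils.

    The default threshold is an arbitrary illustrative cutoff.
    """
    total = sum(tally.values())
    return [
        opt for opt, n in tally.items()
        if opt != correct and total > 0 and n / total < min_share
    ]

# Hypothetical responses of 30 pupils to a four-option item keyed "2".
responses = ["2"] * 17 + ["1"] * 7 + ["4"] * 6
tally = option_tally(responses, ["1", "2", "3", "4"])
weak = unused_distractors(tally, correct="2")
```

Here option 3 attracts no pupils and so contributes nothing to measurement, while options 1 and 4 point to misconceptions worth following up with the pupils who chose them.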


The following hints may be of value in constructing multiple-choice 
items: 


1. The stem of the item should pose the central concept
which is being tested. The following is a poor item: 
A good group intelligence test 
a. is highly reliable 
b. was developed for the U. S. Army in World War I 
c. will give the same average results when adminis- 
tered to unselected groups of boys and girls 
d. should give both verbal and non-verbal IQ's 
In this instance, the distractors refer to four widely dis- 
parate concepts. Each of these concepts—test reliability, 
history of testing, sex differences, types of derived scores 
—might well serve as the stem of an item. 

2. Distractors should be as concise as possible. In the following
item, only the italicized words in each distractor need
be used:

The preferred basis for placing a child in a group for 
instruction in reading in a fifth-grade class is 
a. The mental age obtained by the child on a good 
intelligence test. 
b. The intelligence quotient obtained by the child on 
a good intelligence test. 
c. The chronological age of the child as compared to 
others in the class. 
d. The reading grade obtained by the child on a good 
achievement test.




3. Distractors should be of relatively uniform length. Inexperienced
test constructors often make the mistake of including the largest
number of words in the correct answer.

4. Distractors should be grammatically consistent with the 
stem. The stem and each of the distractors should make a 
complete sentence. The following item is poorly phrased: 

Enrollment in colleges and universities in the United 
States 

a. is at its highest level since World War II 

b. is declining at an accelerated pace 

c. more women than men are attending colleges 

d. high financial costs have reduced enrollments 


Matching Items The matching exercise consists of two parallel col- 
umns of words, phrases, or sentences; the pupil is required to match 
or associate each item of one column with the item which corresponds 
to it in the other column. Each matched pair is scored separately. Ac- 
tually, however, the pairs are interdependent because an incorrect 
response may make an item unavailable for correct pairing. For this 
reason, one of the two columns should contain more items than the 
other. Also, it is better to have two short matching exercises than one 
long one. Some test experts recommend three items in one column 
and five in the other as being the best number for reliable results. 


Example:

A. Score card for rating standardized tests
B. Critical comments on Cooperative Science Test
C. Article summarizing research on "Evaluation"
D. Summary of research on anecdotal records over three-year period
E. Derivation of formula for rank-order correlation

1. Buros: Mental Measurements Yearbook ( )
2. Monroe (editor): Encyclopedia of Educational Research ( )
3. Review of Educational Research ( )

Younger children may be asked to draw lines between the items
which match. Older children may be directed to indicate the appro- 
priate letter for the matching item. The matching exercise takes 
little space and time on a test, but its usefulness is limited. It is not 
adapted to the testing of complex ideas or concepts.

The reader will note that no attempt has been made to describe the
many variations which have been developed of the basic types of
short-answer test items presented above, nor have many suggestions
for writing test items been advanced. Remmers and Gage (7) and
Ross (8) present relatively complete discussions of the first topic.
Ebel's (2) summary of the principles and pitfalls in writing
short-answer test items will be of great help to the neophyte.


Testing Complex Processes 


The teacher will generally find that formulating short-answer ques- 
tions which measure acquisition of facts is a far easier task than de- 
veloping materials for use in testing other objectives of instruction. 
The examples given below have been selected as representative of
approaches to the measurement of pupil achievement in several common
objectives.


Recognizing Assumptions In this objective the student, when given 
certain facts and a conclusion drawn from the facts but involving an 
assumption, is expected to recognize the assumption. Such recognition
may be expected in problems of health, housing, communication, etc.


Example:¹


Statement of Facts:

                        Per Cent of Family Members
                        Who Received No Medical
Family Income           Attention During the Year

Under $1200                        47
$1200 to $3000                     40
$3000 to $5000                     33
$5000 to $10,000                   24
Over $10,000                       14


Conclusion: Members of families with small incomes are 
healthier than members of families with large incomes. 


The conclusion is not completely justified by the facts given. It may
be justified, however, if an assumption is made; that is, if a factor not
stated in the given facts is taken for granted. What is this factor?
What must be assumed in addition to the facts given in order that
the conclusion be true? A multiple-choice question, in which the
correct assumption is listed with several alternatives, is most
appropriate in this instance, thus:

Assumption: (Select one)
(1) Wealthy families had more money to spend for medical care.
(2) All members of families who needed medical attention
received it.
(3) Many members of families with low incomes were not able
to pay their doctor bills.
(4) Members of families with low incomes often did not receive
medical attention.

1 Heil, Louis M., et al., The Measurement of Understanding in Science. In The
Measurement of Understanding, National Society for the Study of Education,
45th Yearbook, Part I, 1946, p. 127.


Interpretation of Data In this objective the student may be expected 
to recognize when an interpretation goes beyond the data and when 
an interpretation is within the data. Data are, therefore, presented for 
the student to consider, together with interpretations to be judged. 
The question then becomes “Is the interpretation completely justified 
by the data alone? Is it contradicted by the data or are the data in- 
sufficient to judge its truth or falsity?" In this objective, the pupil may 
be aware (or not be aware) of the various ways in which interpreta- 
tions go beyond the data—by extension (extrapolation), by imputing 
cause or effect, by not having enough cases to warrant the conclusion, 
etc. Thus, interpretations involving these ideas, as well as statements
which essentially describe the data, may be listed for the student to 
judge. In this kind of question, the natural form of the short-answer 
question is one in which the student is asked to "key" the individual 
statements according to a code. Such a code may be: 


Indicate by:

A. Any statement which is true on the basis of the facts stated
B. Any statement which is probably true on the basis of the
facts stated
C. Any statement for which insufficient data are given to
judge truth or falsity
D. Any statement that is probably false on the basis of the
facts stated
E. Any statement which is false on the basis of the facts
stated




In a study of social studies achievement, several tests were
administered to pairs of pupils in four experimental and four control
classes in the ninth grade. The pairs were matched on the basis of
chronological age, IQ, and socio-economic status. The following test
results were obtained:


                              NUMBER OF       AVERAGE PERCENTILE SCORES
TEST                          MATCHED PAIRS   Experimental    Control

Information and Facts              178             56            56
Skills in Obtaining Facts          193             57            48
Skills in Organizing Facts         181             58            51
Interpreting Facts                 182             58            47
Applying Generalizations           184             58            56
Social Attitudes                   182             51            58


1. The results indicate that pupils in the experimental classes are as
proficient in skills and information as pupils in control classes.

2. Test results show that pupils in the experimental classes have
better mental ability and character traits than pupils in control
classes.

3. Pupils in experimental classes achieve higher test scores on most
of the tests than pupils in control classes.

4. In information and attitudes in the social studies, pupils in
control classes make better test scores than experimental class pupils.

5. Pupils in experimental classes achieve better on some tests because
they have had more practice in taking such tests than pupils in
control classes.

6. The test results prove that experimental classes are superior to
control classes in the objectives tested.

7. Except on the information test, the differences between the two
types of classes are statistically significant.

8. Skills and abilities especially stressed in the curriculum of the
experimental classes were obtaining facts and interpreting facts.

Recognition of Limitations Educational objectives involving recogni- 
tion of limitations are expressed in many different ways. The follow- 
ing examples are presented to illustrate ways in which this objective 
may be translated into short-answer questions. 


In the research problem presented below, ten conditions that may
or may not be necessary for its validity are listed. Check the three
conditions that are most important for the validity of each of the
studies.


An investigator wishes to determine the value of homework
for retarded children. To one class of retarded children,
to be taught by teacher "A," he plans to give spelling
homework daily for two months. During this period, no
homework will be given to a group of equivalent mental
status and IQ, to be taught by teacher "B." The two classes
will study the same words. It is planned to compare the
relative progress of the two groups on the basis of frequent
class tests and a final standardized scale at the end of the
experimental period.

His results will be valid if: 

1. Bias introduced into the study by the fact that the two
groups are to study the same words will be insignificant. 

2. There are no important differences in teacher ability. 

3. There is no carry-over between experimental factors.

4. The curve for learning spelling words is similar to the 
usual learning curve. 

5. The groups are representative of the normal school popu- 
lation. 

6. The weekly tests given to the two groups are similar only 
in minor points. 

7. No homework in subjects other than spelling is given to
the control group. 

8. The two groups are actually equivalent. 

9. The tests measure all the important gains likely to result 
from the use of homework. 

10. All the words to be studied during the experimental 
period will appear on the standardized test administered 
at the end. 

An investigator is interested in determining the accuracy of treat- 
ment of World War II in certain history texts used in the high schools. 
Rank, using 1 for best, 2 for next best, etc., the adequacy of the fol- 
lowing techniques in solving the problem. 

1. Compare the number of pages allowed to that topic in the 
text to the number of pages allowed to topics of similar 
importance. 

2. Compare the treatment of World War II in that book
with the point of view expressed in recognized advanced 
treatises on the subject. 




3. Compare the treatment of the topic in this book with its 
treatment in other history texts in use in the city schools. 
4. Secure the judgment of expert military figures who par- 
ticipated in World War II on the accuracy of the treat- 
ment in the text. 
5. Determine the likely effects of the treatment in this text
upon the character and personality of the child.
6. Investigate the professional standing and political point
of view of the author of the text in order to discover
possible causes of bias.


Application of Principles The behavior expected from students when 
applying principles is usually described as making a prediction or 
an explanation in a situation in which a science or social science 
principle or generalization may validly be applied. Also important is
the ability to state the principle employed in making the prediction or 
explanation, as well as the conditions which must exist if the principle 
may be employed. 


Example:²


John prepared an aquarium as follows. He carefully cleaned
a ten-gallon glass tank with salt solution and put in a few
inches of fine washed sand. He rooted several stalks of weed
(elodea) taken from a pond and then filled the aquarium
with tap water. After waiting a week, he stocked the aquarium
with two one-inch goldfish and three snails. The aquarium
was then left in a corner of the room. After a month the
water had not become foul and the plants and animals were
in good condition. Without moving the aquarium he sealed
a glass top on it. _The sealed aquarium will probably remain
in good condition for several months._ ( )


Directions: If you are uncertain about the truth or falsity of the un- 
derlined conclusion either because the problem is inadequately stated 
or for any other reason, indicate your uncertainty by placing the 
letter “U” immediately after the underlined conclusion. 

If you believe that the underlined conclusion is quite likely to be 
true, place a “T” immediately after the underlined conclusion.


If you disagree with the underlined conclusion, place a “D” after 
the underlined conclusion. 


2 Heil, Louis M., et al., The Measurement of Understanding in Science. In The 
Measurement of Understanding, National Society for the Study of Education, 
45th Yearbook, Part I, 1946, p. 111.




Reasons: If you were uncertain (“U”) about the conclusion, select
from the first 10 reasons given below all those which help you explain
why you were uncertain. Indicate each reason so chosen by placing a
check mark in the parentheses in front of the reason.

If you believe the conclusion to be true (“T”) or if you disagree
(“D”) with the conclusion, select from reasons No. 11 through No. 24
all those which help you to explain your decision thoroughly. Indicate
which reasons you choose by placing a check in front of each.


Reasons to be used if you are uncertain:

( ) 1. Many people who keep fish in bowls change the
water frequently.

( ) 2. It is difficult to know what is meant by the term
“good condition.”

( ) 3. Not all of the aquaria I have seen were sealed.

( ) 4. The amount of exposed water surface is an important
factor in keeping a sufficient amount of oxygen
in the water to support life.

( ) 5. Some water plants produce more oxygen than
others.

( ) 6. Too few fish in a large aquarium will affect the
condition of the plants in the aquarium.

( ) 7. The amount of direct sunlight the aquarium receives
is an important factor in determining whether
or not the aquarium will remain in good condition.

( ) 8. Some fungi harmful to aquatic life develop more
rapidly when the oxygen supply is cut off.

( ) 9. I do not know how sealing an aquarium would
affect the plants and animals in it.

( ) 10. It is important to know the amount of harmful
chemicals, such as chlorine, used in the tap water.


Reasons to be used if you agree or disagree:

( ) 11. The balance between plants and animals is attained
in an aquarium when each supplies the needs of
the other.

( ) 12. The water plants and micro-organisms would not
grow rapidly enough to supply sufficient food for
the fish.

( ) 13. Aquaria in biology classrooms are also kept in
balance for several months even though sealed.

( ) 14. Plants in a sealed aquarium continue to manufacture
food that is utilized in their growth, and they
in turn serve as food for animals.


( ) 15. Just as organisms live at great depths in the ocean 
where there is little oxygen, so can fish live in a 
sealed aquarium. 

(  ) 16. Clerks in pet shops say that a balanced aquarium 
can be maintained for a long time even if sealed. 

( ) 17. In a sealed aquarium, sufficient oxygen cannot be 
absorbed from the enclosed air to supplement the 
oxygen given off by the plants. 

( ) 18. A balance in an aquarium tends to be maintained
as long as one of the interdependent factors does 
not become predominant. 

( ) 19. It is possible to maintain a balance between plants 
and animals in a region. 

( ) 20. If undisturbed, nature will strike a balance between 
plants and animals in a region. 

( ) 21. Just as one does not need to feed fish in a pond, so 
one does not need to supply food in an aquarium 
containing an abundance of plant and animal life. 

( ) 22. Anyone who has studied biology should know that 
a sealed aquarium can be maintained in balance 
for several years if undisturbed. 

( ) 23. The snails in an aquarium can devour the solid 
waste material and excess algae in the water. 

( ) 24. The animals in the sealed aquarium can breathe 
dissolved oxygen supplied by the plants. 


Planning a Short-Answer Test 


Unless a short-answer test is well planned, the result is apt to be 
an inaccurate test, a waste of time, or both. The most significant fac- 
tors to consider in planning are the educational objectives to be rep- 
resented by the test, the nature of the individual questions in relation 
to such objectives, the extent to which different kinds of situations 
and information are probed or sampled by questions, and the ways 
the responses by students are to be summarized. 


EDUCATIONAL OBJECTIVES AS THE 
STARTING POINT IN PLANNING A TEST 


Such objectives, of course, are of several kinds. Some specify the 
acquisition of information; some indicate the application of informa- 
tion, including problem solving and the analysis of argument; others 
specify the ability to interpret new information; still others call for
certain abilities or skills such as reading, writing, and arithmetic.


For a short-answer test (or any other test for that matter) to be 
valid, the specific questions of the test should require the student to 
do those things (recall information, apply information, interpret) spe- 
cifically called for by the educational objectives for which the test is 
being developed. If this requirement is to be fulfilled, such educa- 
tional objectives must be defined in terms of behavior. For example, 
“To understand certain principles or generalizations of science or so- 
cial science" may be an educational objective of a particular teacher 
or of several grade levels in a school. If so, the question arises “What 
will students be able to do if they understand such principles or gen- 
eralizations? What behavior will they be able to demonstrate?” One 
illustration of such behavior is that they will be able to recall specific 
examples of such principles or generalizations. Another may be the 
ability to explain new examples in terms of such generalizations or to 
predict the outcomes in particular new problems or situations by the 
use of such principles. 

Often even this detailed definition of student behavior is not ade- 
quate for proceeding with the construction of a test question. For 
example, if the student is to recall examples of a principle or a gen- 
eralization, is he expected, when given a statement of the principle 
or generalization, to give or recognize illustrations of the principle? 
Is he to give or recognize the principle if given several illustrations 
of it? Is he expected to indicate the limitations of the principle? The 
conditions under which it is not valid? These are the kinds of specific
questions to be answered in planning the short-answer test. The
clearer the answers to such questions, the more likely it is that a
valid test will result.

A second major question to be answered in planning the test is “In
what context or in what situations should the student be able to
demonstrate the desired behavior?” If the student is to apply known
principles in explaining new situations or in making predictions in
problems not dealt with in direct instruction, how complex should
such problems be and how similar should they be to problems which
have been dealt with in instruction?


OUTLINING THE TEST 

Once the objectives of the test have been drawn up, an outline of 
test content should be prepared. Unfortunately, the term “test con- 
tent” has generally been used to denote only the subject matter of the 
test. As a result, test outlines are often restricted to an analysis of 
specific content. In order to make sure that a test will sample those
aspects of behavior which are deemed important, a test outline should
also indicate the type of behavior to be elicited with respect to each
subject matter or content area. A convenient way of identifying both
aspects of test content is through the use of a two-way chart. The
example below presents a sample test outline developed for a unit
on learning in a first course in Educational Psychology and shows the
way in which the outline may be translated into a two-way chart.
This two-way chart is by no means complete. Many spaces have not 
been filled, nor has the chart been extended to include all of the 
material normally covered in a unit on learning. The reader will find 
it of value to attempt to develop a more complete formulation of a 
similar test outline and chart. It is generally helpful, in preparing 
such two-way tables, to include an extra row and column in which 
allocations of weight are given to each major content field and to each 
objective. This should be done in terms of point values on the basis of 
an appropriate scale. 
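
The allocation of weights can be worked out mechanically once relative emphases have been chosen. The sketch below is one possible approach, not a prescribed procedure: each cell of the two-way chart receives points in proportion to the product of its row weight and its column weight. The topic and objective names follow the sample outline for the unit on learning; the weights themselves, the 100-point scale, and the function name are hypothetical.

```python
# Relative weights are hypothetical; a teacher would set them from the
# emphasis each topic and objective received in instruction.
content_weights = {
    '"Types" of learning': 3,
    "The learning curve": 2,
    "Forgetting": 2,
    "Theoretical explanations": 1,
    "Hygiene of school learning": 1,
    "Transfer of training": 1,
}
objective_weights = {"Facts": 2, "Generalizations": 1, "Applications": 2}

def allocate_points(content: dict, objectives: dict, total_points: int) -> dict:
    """Spread total_points over chart cells in proportion to row x column weight.

    Integer arithmetic with largest-remainder rounding keeps the cell
    values summing exactly to total_points.
    """
    weights = {(c, o): cw * ow for c, cw in content.items() for o, ow in objectives.items()}
    weight_sum = sum(weights.values())
    shares = {cell: divmod(total_points * w, weight_sum) for cell, w in weights.items()}
    points = {cell: quotient for cell, (quotient, _) in shares.items()}
    leftover = total_points - sum(points.values())
    # Give the remaining points to the cells with the largest remainders.
    for cell in sorted(shares, key=lambda c: shares[c][1], reverse=True)[:leftover]:
        points[cell] += 1
    return points

chart = allocate_points(content_weights, objective_weights, total_points=100)

# The extra row and column of the chart: totals per topic and per objective.
row_totals = {c: sum(p for (cc, _), p in chart.items() if cc == c) for c in content_weights}
column_totals = {o: sum(p for (_, oo), p in chart.items() if oo == o) for o in objective_weights}
```

With these weights the first topic receives 30 of the 100 points, and the objectives receive 40, 20, and 40 points respectively. A teacher may, of course, prefer to set individual cells by judgment rather than by strict proportion.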


A TEST IS A SAMPLE OF MANY POSSIBLE QUESTIONS 


An important assumption in the construction and use of a short- 
answer test, just as in any other test, is that pupil achievement on the 
test is a true indicator of achievement in the objectives of instruc- 
tion for which the test has been developed. Implicit in this idea is the 
assumption that the quality of the answers to specific questions is 
indicative of what the quality of answers would be to other questions 
involving the same objective and content. Thus, an objective may 
specify that students be able to recognize assumptions in scientific 
situations. If so, a test would contain questions which would give stu- 
dents an opportunity to recognize assumptions in scientific situations. 
Such a test would, of course, contain but a sample of many possible 
scientific situations, and the student's success in such a test would be 
taken as an indication of equal success in another sample of such 
situations. The student's success would not be taken as an indication 
of his ability to identify assumptions in social problems. If instruc- 
tion has provided practice for the student to identify assumptions in 
the latter type of situation, such situations can validly be included as 
a part of an achievement test. 

The adequacy of the sample of possible situations depends, of
course, upon the purpose of the test. If the test is designed to
estimate the achievement of individual pupils, the sampling must be
more complete than when the test purports to measure the achievement
of a group of pupils. A test for estimating group achievement
can, in general, be shorter than a test for estimating individual
achievement and still have comparable certainty of measurement.


SAMPLE TEST OUTLINE


Educational Psychology: Unit on Learning

Objectives:

1. Acquisition of facts concerning the learning process.
2. Formation of generalizations about the learning process.
3. Application of knowledges gained to classroom situations.

Content (Major topics covered):

A. "Types" of learning
B. The learning curve
C. Forgetting
D. Theoretical explanations of the learning process
E. The hygiene of school learning
F. Transfer of training


TWO-WAY CHART FOR COURSE IN EDUCATIONAL PSYCHOLOGY

Content           Acquisition of Facts       Generalizations              Applications

"Types" of        Meaning of:
Learning            Trial-and-error                                       How can trial-and-error
                    learning                                              be reduced?
                    Perceptual learning      Differences in perceptual
                                             responses of adults and
                                             children
                    Association of ideas     Role of drill in             Multiple sense appeal
                                             associative learning
                    Concept formation        Role of experiential
                                             background

Learning          Characteristics:           Causative factors
Curves              Initial spurt
                    Plateau                                               How can the teacher help
                                                                          a pupil who is on a
                                                                          plateau?
                    Short-time                                            How can short-time
                    fluctuations                                          fluctuations be handled?
                    Limits

Forgetting        Characteristics of         Role of:                     Teacher use of each of
                  forgetting curve             Whole vs. part learning;   these in reducing
                                               Spaced vs. massed          forgetting
                                               practice;
                                               Recitation;
                                               Social participation


SUMMARIZING PUPIL RESPONSES 


Short-answer tests, more than any other, are constructed so that 
pupil responses may be summarized in part scores, or subscores. Since
any pupil's achievement varies in different objectives of instruction,
part scores are helpful in guiding the teacher's remedial or instruc- 
tional work and in aiding the pupil in his self-evaluation. 

Part scores on a test may be determined for different dimensions 
of a subject. For example, in arithmetic part scores may be developed 
for achievement in addition, subtraction, fractions, multiplication, 
etc. Part scores may also be utilized in measuring different types of 
response to the same set of questions. For example, on a test on
interpretation of data, in addition to a "rights" score, the student's
responses may be analyzed in terms of the per cent of total opportunities
to go beyond the data, to be over-cautious, and to make crude errors 
in judgment. 

Short-answer tests lend themselves to part score analysis more than 
do other types because of the greater number of specific responses 
obtained in the short-answer test. The test constructor and user should 
be aware, however, of the fact that part scores are almost always less
reliable, subject to greater uncertainty of measurement, than are
total scores.
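
One conventional way of seeing why a part score is less reliable than a total score is the Spearman-Brown formula, which estimates the reliability of a test that has been lengthened or shortened by a given factor. The formula is not discussed in this chapter; it is offered here only as a sketch, and it assumes that the items removed are equivalent to those retained. The figures of 100 items, 20 items, and a total-score reliability of .90 are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Estimated reliability of a test whose length is multiplied by
    length_factor, given the reliability of the original test.

    Assumes the added or removed items are equivalent to the others.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical case: a 100-item test with total-score reliability .90,
# from which a 20-item part score (one fifth of the test) is taken.
total_reliability = 0.90
part_reliability = spearman_brown(total_reliability, length_factor=20 / 100)

print(f"Total score reliability: {total_reliability:.2f}")
print(f"Estimated part score reliability: {part_reliability:.2f}")
```

Under these assumptions the part score falls well below the total score in reliability, which is the caution stated above.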


Summary 


Short-answer tests, which include completion, true-false, multiple-choice,
and matching items as basic types, are widely used in measuring
pupil achievement. Such tests are generally superior to essay
examinations in their sampling of course content, reliability of scor- 
ing, and ease of scoring. They also lend themselves to a number of 
instructional uses. 

The limitations of short-answer tests include difficulty of construction,
failure to eliminate guessing, and greater cost of administration.
Efforts to provide short-answer tests which will measure complex
processes have not yet been completely successful. To insure maximum
validity in short-answer tests, they should stem from an analysis
of objectives of instruction, be expressed in terms of expected pupil
behavior, and should be based upon a test outline which identifies
both course content and the type of behavior to be elicited.


Problems for Class Discussion 


1. Prepare ten true-false questions and five multiple-choice questions
designed to test pupil understanding of the major concepts presented in
this chapter. For each item you construct, indicate the objective to which
the item is directed.

2. Obtain a copy of the course of study in social studies used in your
community. Prepare a two-fold chart, similar to that on page 97, which
would serve as the base for building a test for one of the units outlined
in the course of study.


References Cited in This Chapter


1. Curtis, F. D., and Woods, G. G., "A Study of the Relative Teaching
Values of Four Common Practices in Correcting Examination Papers,"
School Review, 37:615-623, 1929.

2. Ebel, Robert L., Writing the Test Item. In E. F. Lindquist, editor,
Educational Measurement. Washington: American Council on Education,
1951, p. 185-249.

3. Holzinger, Karl J., "On Scoring Multiple Response Tests," Journal of
Educational Psychology, 15:445-447, 1924.

4. Jensen, M. H., "An Evaluation of Three Methods of Presenting
True-False Examinations: Visual, Oral, and Visual-Oral," School and
Society, 32:675-677, 1930.

5. Lee, J. Murray, and Segel, David, Testing Practices of High-School
Teachers. United States Office of Education Bulletin No. 9, 1936.

6. Plowman, Letha, and Stroud, J. B., "The Effect of Informing Pupils of
the Correctness of Their Responses to Objective Test Questions," Journal
of Educational Research, 36:16-20, 1942.

7. Remmers, H. H., and Gage, N. L., Educational Measurement and
Evaluation. New York: Harper & Brothers, 1948, p. 150-180.

8. Ross, C. C., Measurement in Today's Schools. New York:
Prentice-Hall, 1947, p. 103-157.

9. Sims, Verner M., and Knox, L. B., "The Reliability and Validity of
Multiple-Response Tests When Presented Orally," Journal of Educational
Psychology, 23:656-662, 1932.

10. Stump, N. F., "Listening Versus Reading Method in the True-False
Examination," Journal of Applied Psychology, 15:552-562, 1931.


References for Further Reading 


National Society for the Study of Education, The Measurement of Under- 
standing, 45th Yearbook, Part I. Chicago: University of Chicago Press, 
1946. 

Each of the chapters in Section II of this book deals with the measurement
of understanding in a single subject-matter area. The student will
find many illustrations of approaches which might be used to measure
various aspects of achievement.

Lindquist, E. F., editor, Educational Measurement. Washington: American 
Council on Education, 1951. 

Chapters 6 and 7 are definitive treatments of the planning of
short-answer tests and the writing of test items.

Smith, E. R., and Tyler, R. W., Appraising and Recording Student Progress. 
New York: Harper & Brothers, 1942. 


Describes the approach used in developing some of the short-answer 
tests used in the Eight-Year Study. 


CHAPTER SIX

Essay and Oral Examinations


In spite of the widespread use of "new-type," "objective," or
"short-answer" tests, examinations calling for responses in
the form of an essay and classroom situations which require oral 
responses to questions still constitute an important aspect of the 
evaluation of pupil performance. Proponents of more "modern" forms 
of testing have been prone to give the more traditional techniques 
but slight regard. Airy dismissal of the old, without consideration of 
the values which might thereby be discarded, obviously does not 
constitute sound practice. This chapter analyzes both the values and 
the limitations of essay and oral examinations. 


The Essay Examination 


THE ESSAY EXAMINATION IS STILL POPULAR 

The essay examination calls for a relatively free written response
to a problem situation, in which the written answer, when properly
analyzed by the scorer, reveals information regarding selected aspects
of the organization and functioning of the pupil's mental life. The
essay examination has survived the continued criticism of experts in
educational measurement, and remains an approach widely used by
classroom teachers in achievement testing (11).

The relative popularity of the essay examination is not difficult to
understand. The most widely used tests are those which are prepared,
administered, scored, and interpreted by the classroom teacher.
They are generally constructed for use only with those pupils enrolled
in the course taught by the teacher who prepares the test. He
may have many purposes in mind in administering the test: to motivate
his pupils, to determine the success with which he has taught a
unit of subject matter, to encourage additional study. The prepara- 
tion of valid objective-type testing materials not only requires more 
time, but calls for training and experience which the classroom teacher 
seldom has had an opportunity to obtain. Nearly every teacher, how- 
ever, looks upon himself as expert enough to construct a suitable 
essay test in his own subject, and to arrive at a satisfactory grade for 
a pupil. Technicalities such as adequate sampling of content, test 
validity, and test reliability are too often dismissed or ignored. 


OBJECTIVES MEASURED BY ESSAY EXAMINATIONS 


Valid educational outcomes exist which do not readily lend them- 
selves to testing via so-called objective techniques. Many of these 
objectives may be subsumed under the general term “higher mental 
processes." At best, a short-answer test of the multiple-choice type, 
for example, can serve only as a means for collecting evidence con- 
cerning how well the pupil can judge the best of several alternative 
hypotheses which are presented to him. Does this constitute the same 
kind of judgment which is involved in the formulation of a hypothesis 
in a problem? While the research evidence which is available does 
not permit a definitive answer to this question at the present time, it 
appears obvious that the use of short-answer questions makes it 
extremely difficult to "follow the student's thinking" in such a higher 
mental process as formulating hypotheses. 

The student's ability to organize and express his ideas effectively 
is another objective which lends itself to measurement via the essay 
examination. Although many attempts have been made to develop 
valid short-answer devices for appraising a pupil's achievement in 
this objective, the results obtained have not been sufficiently con- 
vincing to merit their acceptance. In this connection, it should be 
noted that the use of the essay examination makes it possible for the 
teacher to direct the attention of the pupil to large segments and in- 
tegrated units of subject matter. A question such as "Compare the 
accomplishments of the F. D. Roosevelt and the Eisenhower adminis- 
trations in the field of domestic affairs during their first year of office," 
forces the pupil to consider the available facts, select those which 
are pertinent, and express his conclusions in his own words. 

One must be careful, however, not to claim too much for the essay 
examination. Many writers, for example, maintain that the essay ex- 
amination may be used to arrive at an estimate of the creative ability 
of the pupil. While it is true, of course, that the ability to organize 
materials and the ability to formulate hypotheses are, in part, creative
in character, an exact definition of “creativeness,” and an analysis of 
the extent to which this trait is actually operative in the testing 
situation, have yet to be presented. Until a more precise concept of 
creative ability is advanced, it appears to be unwise to place too great 
stress upon this value for the essay examination. 

The pupil's response to an essay examination has also been looked 
upon as a source from which the teacher could glean some insight 
into the pupil's personality. Sims (9), in emphasizing the value of 
essay questions as a projective device, points out that test responses 
can furnish clues to the dynamics of the pupil's mental functioning. 
Unfortunately, the number of instances in which a pupil's responses 
to the usual essay question lend themselves to interpretation in terms 
of psychological adjustment is very small. While the teacher may, in 
isolated instances, uncover an unexpected manifestation of a per- 
sonality characteristic, it is well to look upon the essay test as a meas- 
ure of the pupil's achievement, rather than as a device from which 
to draw inferences concerning personality structure. 


CRITICISMS OF THE ESSAY EXAMINATION 

One of the most important characteristics of any test is the
consistency with which competent examiners evaluate the responses
of the examinee. The principal criticism of the essay examination has 
been directed to this point—the evaluation of answers to essay ques- 
tions is unreliable. A very large number of studies have dealt with this 
factor of “reader reliability.” From the early studies of Starch and 
Elliot (15) down to those conducted in recent years, research workers 
have consistently found marked variability in the marks or grades 
assigned to the same paper by two or more readers. 

Several factors operate to make the marks assigned to essay ques- 
tions unreliable. Often, different scorers disagree concerning the ob- 
jective which is being measured—is the question stressing selection 
of facts, interpretation, or use of English? Scorers may disagree, too, 
in the weight which is assigned to the several elements of a response, 
in standards of grading responses of varying quality, or even in in- 
terpretation of the purpose of the question. 

Extraneous elements may enter into the evaluation of pupil re- 
sponses. Every experienced teacher is familiar with the student who 
says nothing, but who says it very well. Too often, he receives a grade 
based more upon his ability to express himself than upon his under- 
standing of the subject. Comparison of one paper with others graded 
by the teacher may result in an improper mark: a mediocre paper
looks very good to a teacher who has just read a series of poor papers; 
an average presentation seems very weak when preceded by a series 
of excellent papers. The teacher must also guard against a tendency 
to grade not only the answer which is before him, but also the work 
of the pupil in his class during the course of the school year. 

The essay examination has also been criticized on the grounds that 
the sampling of content, or range of information tested, is narrower 
than it is in objective examinations. When the teacher administers a 
100-item multiple-choice test, the pupil is called upon to make 100 
independent responses. Rarely is an equivalent number of
independent responses to be found in a pupil's reaction to the limited
number of essay questions which can be asked in the same period of 
time. 

The limited scope, in terms of sampling, of the essay examination, 
places a disproportionate emphasis upon correct interpretation of each 
question by the pupil. Failure to understand what the teacher has in 
mind on one essay question penalizes the pupil severely. The pupil 
who misinterprets two or three objective questions does not suffer 
such serious damage. 

Another criticism leveled against essay examinations constructed 
by teachers is that emphasis is placed upon the recall of more or less 
specific information. In spite of the fact that teachers maintain that 
their courses are designed to guide the pupil to think critically, to 
interpret facts, and to organize materials, Sims (8) found in his study
of a group of essay questions that over 35 per cent called for no more 
than simple recall of memorized information. While recognition and 
recall of information are important, such outcomes can be more 
validly and reliably measured by objective fixed-answer test items 
than by a free response essay examination. 


IMPROVING THE ESSAY EXAMINATION 


Much of the criticism of the essay examination is due to poorly 
constructed tests made by teachers for use in their classes. Many 
studies (7, 12, 14, 17) have indicated, however, that when appropriate
care is taken in constructing and grading essay examinations,
consistently good results may be obtained in appraising a pupil's
performance. Suggestions for improving the essay examination are 
summarized below: 


1. Each question in an essay test should be planned to measure one 
defined objective of instruction for which no valid or reliable short- 
answer test is available. 


2. Essay questions should be formulated to require a definite,
restricted answer for the objective tested. Questions should avoid
being vague: "Discuss the following . . . ," "What does the above
show?", or even "Write a half page on the following topic." To
formulate essay questions of this kind is to play a "cat and mouse
game" with the student. Sometimes this procedure is rationalized
by the teacher who "wants to see what the student can do." Such
questions force the pupil to put his primary attention on making 
shrewd guesses concerning what it is that the teacher desires, rather 
than on the real problem of giving those specific responses which 
will demonstrate his achievement. Another reason why questions 
are often stated in general terms is that the teacher feels that by 
framing specific questions he is "giving away the answer." The 
real point is probably that the formulation of questions definite 
enough for the student to know exactly what he is expected to do 
is not an easy process. With a little thought, however, questions 
can be developed with sufficient specificity without revealing the 
answer to the student. For example, the question, "Compare the 
League of Nations and the United Nations," may be improved by 
giving the pupil some help in framing his response as follows: 
"Compare the League of Nations and the United Nations with 
respect to (a) aims and purposes, (b) administrative structure, and 
(c) success in settling disputes among member nations." The more 
elaborate form not only gives the pupil a clearer definition of the 
intent of the question, but it allows for the development of a more 
reliable grade. 

3. The pupil should not be allowed to make a choice among several 
questions. The use of optional questions makes it almost impos- 
sible to arrive at comparable scores for pupils who have answered 
different questions, since the equating of two or more questions 
for difficulty constitutes a complex problem (13). Moreover, the 
judgment of the pupil concerning which question he can answer 
to best advantage may be wrong, as Meyer (6) has demonstrated. 

4. In grading essay examinations, a standard answer should be
formulated in which a specific number of credits is allotted to each
significant point which the pupil is expected to make. A sampling 
of pupil responses should then be read, in order to determine 
whether the respondents have interpreted the question correctly, 
and whether any changes in the standard answer are necessary. 
The following illustrations and suggestions, adapted from those 
used by the agency responsible for the selection of teachers for a
large city school system, will serve to clarify the process of devel- 
oping standard answers and allotting credit values. 


a. When there is a limited range of choice among acceptable an- 
swers or parts of answers, all or most of such possible choices should 
be included in the standard answer. 


Example: 


Question: What are five of the most common and most rep- 
rehensible ways in which time is wasted by inefficient teach- 
ers during an arithmetic lesson? In each case state what 
measures an efficient teacher should adopt to prevent such 
loss of time. (15)

Standard Answer: Three points were given for each of any 
five of seventeen listed instances of wasted time together with 
the appropriate preventive measure or measures. 

Time may be wasted: 


1. In taking too much time for the distribution of material.
(For full credit an adequate means of mechanizing the
routine was expected.)

2. In ruling paper and preparing heads with too great
elaboration. (For full credit the proper preparation of the
papers was to be indicated.)

3. In dictating problems when they may be presented in a
shorter or easier way. (Acceptable means of correcting
above: use of mimeographed sheets, use of textbooks,
problems written on blackboard before the arithmetic
period.)

16. In over-rationalization in the presentation of a new topic.
(For full credit an illustration of proper use of
rationalization was expected.)

17. In failure to grade difficulties in the presentation of a
new topic. (An illustration of proper grading was expected.)


b. As in the preceding example, a question may call for a specified 
number of "values," "examples," "reasons," "instances," etc. When,
however, the number is not specified in the question, the standard 
answer should definitely state how many of these should be required 
for a maximum rating. In such case, it is usually fair to provide for 
a degree of flexibility by taking into consideration the fullness and 
clarity of treatment given by the applicant to each of the separate
values, examples, reasons, etc., adduced by him. Not all points for 
which credit is allowed need be given equal value; obviously, the 
weight given to an item should be in proportion to its significance 
or its importance. 


Example: 


Question: In a paragraph of about fifty to seventy-five 
words, state, with reasons, the value and the limitation of the 
use of model compositions in instructing children to write 
English well. (15) 

Standard Answer: It is reasonable to expect that at least 
ONE important "value" and ONE important “limitation” 
shall be EFFECTIVELY developed for a maximum answer. 
In lieu of a thoroughly effective development of ONE point, 
a less convincing treatment of two or three important points 
may be accepted as worth the maximum rating. 


c. If the possible range of answers is very wide, it may not be prac- 
tical to do more than to suggest one or more typical answers and to 
lay down certain general principles that should govern the rating. 


Example: 


Question: State, with reasons, six measures or precautions 
which a teacher should take in order to safeguard the eye- 
sight of the children in his class. (12) 

Standard Answer: There are a very large number of possible
acceptable answers. These may be drawn from considerations
involving care

in schoolroom lighting,
in use of the blackboard,
in adjustment of furniture,
in furnishing books and paper in accordance with hygienic
requirements for children,
in teaching them to break or to avoid deleterious eye habits,
in identifying pupils with visual defects,
in getting such pupils under treatment,
in seeing to it that they wear their eye-glasses, etc.

Any six acceptable points with suitable "reasons" should be
given maximum rating. "Reasons" are likely in most cases to
be rather general, but should be accepted if correct; medical
technicalities should not be required.

The points should, of course, be mutually exclusive. If an 
applicant says, first, that the teacher should discover which 
children have defective vision, and, later, that a teacher 
should periodically examine the eyes of children, these two 
partial reasons should receive only two points credit, not four. 


d. Answers of the type referred to in the three preceding sections 
usually range in quality from altogether acceptable answers worthy 
of maximum rating, through answers worthy of partial credit, to 
altogether unacceptable answers. An answer or a subdivision of an 
answer may deserve only partial credit because it is only partially 
true, because it is too general, because it is true only in a limited
number of instances, because it represents too minute a subdivision
of the thought, because it is in the nature of a repetition or over- 
lapping of some other part of the answer, or because of other faults. 
The standard answer should, when practicable, give typical in- 
stances of answers for which partial credit may be given. In each 
case the amount of credit permissible should be stated. 


Example: 


Question: Describe how a first-grade teacher may develop 
in the minds of her pupils a clear conception of the magni- 
tude of a quart (4). 

Standard Answer: For maximum credit an applicant should
give two superior devices or three acceptable devices. Scale
accordingly, one superior device being worth two points; one
acceptable device, 1.3 points; one inferior device, 0.7 points
or less to 0.


Superior

1. Teacher brings to school and allows children to SEE AND
HANDLE common forms of quart measures in commercial use,
as quart-sized milk bottle, quart oil measure used
in garages, quart measures used by housewives, quart
berry box, etc.

2. Children estimate the number of quarts in vessels such as
pitcher, aquarium, etc., and then verify their guesses by
actual measurement.

Acceptable

3. Teacher shows children quart measure.

4. Teacher compares quart measure with pint and gallon by
actual measurement of liquids. (Not superior because
pint and gallon are taught in second grade.)

Inferior

5. Teacher reminds children of quart measures they have
seen.

6. Teacher compares dry with liquid quart measures.

7. Children tell what they have bought by the quart.

e. The standard answer should subdivide the point allowance for
the question and clearly state how many points should be allowed
for each subdivision and each "sub-subdivision" of the question.


Example:

Question: Explain and illustrate the following: "There is a
tendency away from class or mass teaching in our schools of
today." (10)

Standard Answer: The ten credits were subdivided, allowing
four (4) for the explanation and six (6) for the illustration.

Under the explanation the following were expected:

                                                      ALLOWANCE
(1) Statement of historical background                ½ point
(2) Clear explanation of mass teaching versus
    individual teaching                               1 point
(3) Appreciation of the influence of educational
    measurements, of intelligence and achievement
    tests                                             1½ points
(4) Influence of modern child study                   1 point
                                                Total 4 points


f. For a question for which the organization of the answer is not
more or less indicated in the question itself, and especially one
which explicitly calls for a well-organized answer, it may be
desirable to allot a small portion of the total point allowance to a
rating on organization.


Example:

For excellent organization of the answer, allow a maximum
of 3 points; for a reasonably good organization, allow 2
points; for fair organization, allow 1 point; for definitely poor
organization, allow no credit. Intermediate ratings of 2.5, 1.5
or 0.5 may be used where appropriate.

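
The allotment of credit under a standard answer can be set down as a small scoring rule. The sketch below, in Python, follows the pattern of the question on the magnitude of a quart, in which credit depends on the quality of each device offered and cannot exceed the maximum for the question. The point values are assumptions made for illustration: 2 points for a superior device, 1.3 for an acceptable device (so that three acceptable devices come to approximately the four-point maximum), and 0.7 for an inferior device, which the standard answer treats as an upper limit. The names POINTS and score_answer are likewise hypothetical.

```python
# Assumed credit per device, by the reader's judgment of its quality.
POINTS = {"superior": 2.0, "acceptable": 1.3, "inferior": 0.7}
MAX_CREDIT = 4.0  # maximum credit allowed for the whole question


def score_answer(device_ratings, max_credit=MAX_CREDIT):
    """Add the credit for each rated device and cap the sum at max_credit."""
    raw = sum(POINTS[rating] for rating in device_ratings)
    return min(raw, max_credit)


# Three hypothetical papers, each rated device by device by the reader.
papers = {
    "Paper A": ["superior", "superior"],
    "Paper B": ["acceptable", "acceptable", "acceptable"],
    "Paper C": ["superior", "superior", "acceptable"],
}

for name, ratings in papers.items():
    print(f"{name}: {score_answer(ratings):.1f} points")
```

The cap on total credit prevents a long answer containing many mediocre devices from outscoring a short answer containing the two superior devices which the standard answer requires.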
Sims (7) recommends the use of a technique which involves the 
sorting of pupil papers into five groups: very superior, superior,
average, inferior, and very inferior, on the basis of a rapid reading of the
total examination. Careful reading follows in order to correct mis- 
takes in the original evaluation based on the first cursory reading. 
Letter grades are then assigned to the total examination: A for very 
superior papers, B for superior, etc. Sims feels that approximately 
10 per cent of all papers would be rated A, 20 per cent B, 40 per cent 
C, 20 per cent D, and 10 per cent F. 
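
The approximate distribution suggested by Sims may be illustrated by a brief sketch in Python. It assumes that the papers have already been placed in rank order, best first, as a result of the readings described above; the function name assign_letter_grades and the class of twenty papers are hypothetical. The percentages are treated as fixed quotas only for the sake of the example, since the text describes them as approximate.

```python
# Approximate per cent of papers in each group, best group first.
QUOTAS = (("A", 10), ("B", 20), ("C", 40), ("D", 20), ("F", 10))


def assign_letter_grades(ranked_papers, quotas=QUOTAS):
    """Assign letter grades to papers already ranked from best to poorest,
    cutting the ranked list at the cumulative percentages in quotas."""
    n = len(ranked_papers)
    grades = {}
    start = 0
    cumulative = 0
    for letter, per_cent in quotas:
        cumulative += per_cent
        end = round(cumulative * n / 100)  # position of the cut in the ranked list
        for paper in ranked_papers[start:end]:
            grades[paper] = letter
        start = end
    return grades


# A hypothetical class of twenty papers, ranked best to poorest.
ranked = [f"paper {i + 1}" for i in range(20)]
grades = assign_letter_grades(ranked)
for letter, _ in QUOTAS:
    print(letter, list(grades.values()).count(letter))
```

With a small class the cuts fall unevenly, which is one reason the percentages are better regarded as a guide than as a rule.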

This approach deals largely with the general merit of the examina- 
tion as a whole. When more than one question appears on the total 
test, variability in pupil response from question to question makes 
a general appraisal rather difficult. The teacher, too, will find a model 
answer approach to grading single questions in essay examinations of 
greater value as a guide to pupil weaknesses. The use of a rating 
procedure, while of some value in quantifying a pupil’s mark, does 
not reduce reader unreliability to the same degree as the use of a 
model answer. 


STUDENT REACTIONS TO ESSAY TESTS 


Another important aspect of the essay examination is the mind set 
and related modes of study that students use in preparing for tests. 
Douglass and Tallmadge (2) found that students report that, in pre- 
paring for an essay examination, they reviewed and read generalities 
and trends, attempted to draw several important conclusions from
tables, formulated personal opinions, and read notes on the texts and 
lectures carefully but without memorizing details. The same students 
testified that they paid more attention to minute details and tried to 
remember the exact words of the book and other specific points when 
they prepared for short-answer tests. Meyer (5), in studies of the 
examination set which students have, concluded that the essay ques- 
tion is of fundamental importance in learning and retaining sense 
material. In effect, if the teacher wishes students to recall isolated 
facts, when specific cues are given, a new-type objective examination 
set may be used with profit. If the teacher wants students to recall 
material in an organized fashion and to know facts when cues are not 
given, the essay examination should be used in preference to any 
objective-type examination. It is important for the teacher to deter- 
mine what sort of reaction he wishes to test and to adapt his methods 
toward the outcome he has in mind. 


The Oral Examination 


In the past, the classroom teacher relied very heavily on the oral 
work of his pupils in order to arrive at an estimate of the extent to
which they mastered the work of his course. The modern teacher, 
of course, does not attempt to grade every oral contribution of every 
pupil; rarely does one find a present-day classroom dominated to such 
a degree by the marking book. The re-citation of a learned lesson, 
with the opportunity it provided for grading pupils, has given way 
to patterns of classroom procedure organized in terms of experiential 
activities. This change has been accompanied by a marked reduction 
in the opportunities provided for testing and marking the oral work
of pupils.


LIMITATIONS OF THE ORAL EXAMINATION

Even at its best, however, the oral quiz serves as a very poor basis 
for determining the grade which should be assigned to a pupil's work 
in a given course. Oral examinations take too much time to ad- 
minister; like the essay test, they present the same difficulties of poor 
sampling and high subjectivity in marking; comparability of questions 
is difficult to obtain. 


VALUES OF THE ORAL EXAMINATION 

The oral questioning of an individual pupil constitutes an excellent 
means of following the thought processes which he has used in solv- 
ing certain problems, such as one in mathematics. Used in this man- 
ner, the oral quiz becomes a valuable tool for the diagnosis of pupil 
difficulties. Skillful questioning by the teacher, too, may help the
pupil to apply known scientific information to a new situation or to 
see implications, such as those involved in adopting a given economic 
policy. The use of this Socratic-like approach should not, generally, 
be looked upon as a rod for measuring pupil achievement; rather, 
it represents a worthwhile instructional technique. 

In some situations, an oral examination constitutes the only way in 
which a measure of pupil attainment can be obtained. When children 
have not yet mastered the intricacies of reading, an oral examination 
must be used. Reading readiness tests and intelligence tests given in 
the lower elementary-school grades, for example, generally call upon
the pupil to follow a series of oral directions in making his responses
on a printed page. Such tests, of course, require standardized
administration and scoring.

The oral examination, used in the course of classroom work, does
have the obvious advantage of saving the time and expense which is
involved in reproducing copies of a test. As a result, oral administration
of teacher-made tests of the short-answer type is frequently
found. In a sense, oral presentation of such teacher-made tests eliminates
some of the difficulties which are associated with the administration
of mimeographed or printed tests. The relative informality
of the oral test situation is far less frightening to the timid child. The 
pupil cannot ponder over an item which he finds difficult, losing so 
much time that he cannot complete the test. Moreover, the oral 
examination places a premium upon oral rather than reading com- 
prehension. The relatively heavy emphasis which the school places 
upon the latter aspect of comprehension is thereby reduced. 


THE RELIABILITY AND VALIDITY OF ORAL EXAMINATIONS 


In general, the research studies which have been conducted in this
field indicate that short-answer tests which are administered orally
yield appreciably the same results as printed tests. Briggs and Arma- 
cost (1), Lehman (4), and Stump (16), for example, have found 
that an oral true-false examination compared very favorably with such 
tests presented in visual form. In general, the correlation between 
the same true-false test administered orally and visually differs only 
slightly from the correlation between the odd and even items of the 
test given orally. Much the same results were obtained when a five- 
alternative multiple-choice test was studied. In this instance, while 
oral presentation tended to increase the difficulty of the test slightly, 
no loss in reliability or validity was demonstrated (10). One may 
conclude, from the available evidence, that orally presented teacher-made
short-answer tests yield results that are comparable to mimeographed
or printed tests of similar length and construction.
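
The comparison described above rests on two correlations: one between
scores on the same test given orally and visually, and one between the
odd and even items of the oral form. The following is a minimal sketch of
that arithmetic in Python, not a reproduction of any cited study. The
pupil responses and scores are invented for illustration, and the names
`pearson_r` and `odd_even_scores` are assumptions of this sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length score lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def odd_even_scores(item_matrix):
    """Split each pupil's item marks (1 = right, 0 = wrong) into odd and even half scores."""
    odd = [sum(row[0::2]) for row in item_matrix]   # items 1, 3, 5, ...
    even = [sum(row[1::2]) for row in item_matrix]  # items 2, 4, 6, ...
    return odd, even


# Hypothetical marks of six pupils on an eight-item oral true-false test.
oral_items = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0, 1, 0],
    [1, 0, 0, 0, 0, 0, 0, 0],
]
oral_totals = [sum(row) for row in oral_items]

# Hypothetical total scores of the same pupils on the printed form.
visual_totals = [8, 5, 5, 3, 1, 2]

odd, even = odd_even_scores(oral_items)
r_oral_visual = pearson_r(oral_totals, visual_totals)  # oral form vs. printed form
r_odd_even = pearson_r(odd, even)                      # odd vs. even halves, oral form

print(f"oral vs. visual: {r_oral_visual:.2f}")
print(f"odd vs. even (oral): {r_odd_even:.2f}")
```

If the two coefficients are close, the oral form is behaving about as
consistently against the printed form as it does against itself, which is
the pattern the studies above report.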


ORAL TRADE TESTS 


One of the most recent applications of the oral examination is seen
in the development of the oral trade tests. Such tests are designed to 
gauge an individual's knowledge of a given trade or occupation. As 
such, they offer a ready means of screening applicants for a position 
when only a brief time is available for interviews. Oral trade tests 
were widely used by the armed forces during World War II as an 
aid in the proper classification of personnel. 

In preparing oral trade tests, a number of general principles must 
be borne in mind. The wording used in questions must be clear; the 
language chosen must be that used by the worker on the job. The 
question must relate to an activity or to a task which is an important 
element of the trade and which represents common trade practice. 
The question must be specific, and call for a specific answer; general 
questions evoke generalities as responses. Most important, the questions
must be able to distinguish between the various levels of trade
knowledge. In a well-constructed oral trade test, apprentices will 
make lower scores than journeymen, who, in turn, will make lower 
scores than experts. Novices or persons working in related fields 
should show little or no knowledge of the given trade. In general, a
total of fifteen questions proves to be all that are needed to arrive
at an adequate estimate of the trade background of a worker. When
these criteria are met, an oral trade test provides a valuable supplement
to the other information which an employment interviewer needs
in correctly classifying a job applicant.
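
The requirement that scores rise from novice to apprentice to journeyman
to expert can be checked with simple arithmetic once a trial group has
been tested. The sketch below is illustrative only: the scores are
invented, the function name `discriminates_levels` is an assumption, and
comparing group means is just one plain way of expressing the criterion
stated above.

```python
from statistics import mean


def discriminates_levels(scores_by_level, order):
    """Return True if the mean score rises strictly from each level to the next in `order`."""
    means = [mean(scores_by_level[level]) for level in order]
    return all(lower < higher for lower, higher in zip(means, means[1:]))


# Hypothetical scores on a fifteen-question oral trade test.
scores = {
    "novice": [0, 1, 0, 2],
    "apprentice": [5, 6, 4, 7],
    "journeyman": [9, 10, 11, 9],
    "expert": [13, 14, 15, 14],
}
LEVELS = ["novice", "apprentice", "journeyman", "expert"]

test_is_usable = discriminates_levels(scores, LEVELS)
print("Separates trade levels:", test_is_usable)
```

A set of questions that fails this check would need revision before it
could serve the interviewer as a supplement to other information.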

Sample questions from oral trade tests developed by the Occupa- 
tional Analysis Section of the United States Employment Service (3) 
will serve to make clear their general nature: 


Automobile Mechanic: 

Name the firing order on a six-cylinder motor. What clear- 
ance do you allow on crankshaft bearings with forced oiling 
systems? What is the average limit of toe-in on automobiles 
without knee action? 


Pastry Cook: 

What percentage of shortening do you use in a good pie 
crust? What ingredient do you put into biscuit dough that 
you do not put into pie dough? What do you use as leavening
in angel-food cake?


Summary 


Essay examinations, which still constitute a widely used approach to
achievement testing utilized by the classroom teacher, are of greatest
value in testing the higher mental processes, such as formulating
hypotheses and organizing and expressing ideas. The criticisms which
have been directed against the essay examination deal largely with
reader unreliability, lack of objectivity, inadequacy of sampling, and
stress upon recall of specific information. The research evidence in
the field, however, indicates that the essay examination can be improved.
Too, it appears that students preparing for essay examinations
use techniques of study which stress the development of general
conclusions.

The oral examination, like the essay test, is widely used by the classroom
teacher, particularly as an informal device for evaluating pupil
performance. Even at its best, however, the oral quiz is a poor basis
for assigning pupil marks. It has value as a diagnostic tool, and in
situations in which written tests cannot be used. The reliability and
validity of teacher-made short-answer tests orally presented compare
favorably with those of mimeographed or printed tests of similar
length and construction. The oral trade test is of particular value
in evaluating the background of job applicants.


Problems for Class Discussion 


1. (a) Prepare an essay question designed to test student understanding of
the major concepts presented in this chapter. (b) Set up a model answer
for use in grading the question, indicating the reasons for assigning
the values you give to the several sections of the model answer.

2. Select a trade area about which you have some working knowledge.
Prepare an oral test for administration to a hypothetical candidate for a
position in the trade. Justify your choice of questions and the point
value, if any, you would give to each question.


References Cited in This Chapter 


1. Briggs, Thomas H., and Armacost, George H., "Results of an Oral True-
False Test," Journal of Educational Research, 26:595-596, 1933.

2. Douglass, Harl R., and Tallmadge, Margaret, "How University Students
Prepare for New Types of Examinations," School and Society, 39:318-
320, 1934.

3. Federal Security Agency, United States Employment Service, Oral
Trade Questions, Volume I. Washington: United States Government
Printing Office, 1940.

4. Lehman, Harvey C., "The Oral vs. the Mimeographed True-False,"
School and Society, 80:470-472, 1939.

5. Meyer, George, "An Experimental Study of the Old and New Types of
Examination: I. The Effect of Examination Set on Memory," Journal of
Educational Psychology, 25:641-661, 1934.

6. Meyer, George, "The Choice of Questions on Essay Examinations,"
Journal of Educational Psychology, 30:161-171, 1939.

7. Sims, Verner M., "The Objectivity, Reliability and Validity of an Essay
Examination Graded by Rating," Journal of Educational Research.

8. Sims, Verner M., "Essay Examination Questions Classified on the Basis
of Objectivity," School and Society, 35:100-102, 1932.

9. Sims, Verner M., "The Essay Examination is a Projective Technique,"
Educational and Psychological Measurement, 8:15-31, 1948.

10. Sims, Verner M., and Knox, L. V., "The Reliability and Validity of
Multiple Response Tests When Presented Orally," Journal of Educational
Psychology, 23:656-662, 1932.

11. Stalnaker, John M., "The Essay Type of Examination." In E. F. Lindquist,
editor, Educational Measurement. Washington: American Council
on Education, 1951.

12. Stalnaker, John M., "The Validity of the University of Chicago Qualifying
Examinations," English Journal (College Edition), 23:384-388,
1934.

13. Stalnaker, John M., "A Study of Optional Questions on Examinations,"
School and Society, 44:829-832, 1936.

14. Stalnaker, John M., and Stalnaker, Ruth C., "Reliable Reading of Essay
Tests," School Review, 42:599-605, 1934.

15. Starch, D., and Elliot, E. C., "Reliability of Grading High School Work
in Mathematics," School Review, 21:254-259, 1913.

16. Stump, N. F., "Oral vs. Printed Method in the Presentation of True-
False Examinations," Journal of Educational Research, 18:423-424,
1938.

17. Traxler, Arthur E., and Anderson, Harold E., "The Reliability of an
Essay Test in English," School Review, 43:534-539, 1935.


References for Further Reading 


Lindquist, E. F., editor, Educational Measurement. Washington: American
Council on Education, 1951.
Chapter 13, by John M. Stalnaker, presents an excellent discussion of
the essay type of examination.

Smith, E. R., and Tyler, R. W., Appraising and Recording Student Progress.
New York: Harper & Brothers, 1942.
Describes some of the instruments used in the Eight-Year Study which
sought to measure the same objectives ordinarily tested via essay
examinations.


Observation and Anecdotal Records


CHAPTER SEVEN 


Direct observation of performance or behavior is the 
basis for collecting data for anecdotal records and observational 
techniques. Both methods of evaluation require an observer who 
records the activities, experiences, and expressions of individuals or 
groups. When these methods are restricted to selected aspects of 
behavior and when records are made systematically and as objectively 
as possible, these techniques gain in reliability and validity. 

Informal observation is employed daily by the classroom teacher 
who watches and notes such pupil performance as competence in 
oral reading or in solving an arithmetic exercise. Observation may 
be used, also, in making a running account of pupil conduct. Struc- 
tured, or controlled, observational techniques, however, require mak- 
ing an objective record of the defined activities under study. For this 
purpose, the method may use special techniques or tools such as 
specially prepared charts or checklists for recording behavior. 

Anecdotal records are cumulative notes of an individual's behavior 
observed in typical situations, activities, and experiences. The anec- 
dotal record is a concise and objective description of the behavior, 
e.g., "Sue cried when she failed to solve an arithmetic problem.” The 
interpretation of such behavior should be reserved until an adequate 
sample of such observations has been collected. The observations 
should be restricted to three or four defined characteristics and to a 
reasonable number of individuals so that the burden of observing and 
recording will not become excessive, contributing, thereby, to in- 
creased validity, reliability, and objectivity of the anecdotal records. 

The role played by observational techniques in measurement and
evaluation is somewhat different from that of anecdotal records. Systematic
observational study, using predetermined units and time
samples, is in the main carried out by research investigators in psychological
and sociological studies of children and adults. Anecdotal
records and informal observation, on the other hand, are used by the
classroom teacher, counselor, or group worker to a greater extent than
by professional psychologists and researchers. There is, however, no
strict line of demarcation between the methods and their use. The
remarks on observational techniques are directed to teachers as well
as to research persons.


OBJECTIVES EVALUATED BY OBSERVATION 

It has been pointed out (2) that if a given objective is regarded 
as an important feature of the school program, provision should also 
be made to determine whether it is being achieved. Data on physical 
status, academic achievement, and study skills can be obtained more 
easily by evaluative instruments other than that of observation. Ob- 
jectives for whose evaluation methods of observation are most valu- 
able are generally in the areas of social adjustment or personal and 
emotional adjustment and growth. Observations on the physical and 
intellectual aspects of growth should be made by use of this tech- 
nique only if they seem to have a marked bearing on the child's personality.
It should be noted that the observation of any objective
requires that it be defined in terms of overt actions of individuals. 

In the area of social adjustment and growth, observations on many
forms of behavior can be made. For instance, in what situations does
a child play alone or with others, follow or lead? Does he fight with
other children? The personal and emotional adjustment of the child
has to do with his growth in personal independence, in responsibility,
and in initiative. Emotional reactions which may be observed are
manifold. Does the child have nervous habits? Does he strike or push
others? Does he exhibit frustration? Methods of direct observation,
however, have also been used to measure study habits, interests,
kinesthetic movements, and other diverse problems.


Observational Techniques 


PLACE OF OBSERVATION IN EVALUATION 
Observation is a direct method of sampling behavior in social situa- 
tions, and, as such, performs a vital service as a tool of evaluation. 
Through the use of observation, an objective description of individuals 
in their actual interrelationships with each other and with their environment
may be secured. By catching the natural modes of their
actions and expressions, observational techniques permit the process 
of measurement without disturbing the normal activities of groups or 




"Each time a pupil engages in an activity which falls within the
scope of any one of the defined categories, the observer writes the 
code or symbol next to the pupil's name. During the total period 
of observation a given pupil may engage in the acts five, twenty, 
or any number of times. This frequency is a quantitative value, and 
represents the frequency of occurrence per stated unit of time. If 
a class is observed for a total period of time amounting to 200 
minutes, the score for the category of initiative for pupil A may 
be 5, for pupil B, 12, and for pupil C, 22." 
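
The tallying procedure in the passage above can be sketched in a few
lines of Python. This is an illustration under stated assumptions, not
part of the quoted source: the coded entries are invented, and the names
`tally` and `rate_per_100_minutes` are hypothetical. Each entry stands for
one code written next to a pupil's name.

```python
from collections import Counter


def tally(entries):
    """Count how many times each (pupil, category) code was written down."""
    return Counter(entries)


def rate_per_100_minutes(count, minutes_observed):
    """Express a raw frequency as occurrences per 100 minutes of observation."""
    return 100 * count / minutes_observed


# Hypothetical coded entries made during the observation periods.
record = [
    ("A", "initiative"),
    ("B", "initiative"),
    ("A", "cooperation"),
    ("B", "initiative"),
    ("C", "initiative"),
]
counts = tally(record)

# Frequency of the "initiative" category for each pupil.
for pupil in ("A", "B", "C"):
    print(pupil, counts[(pupil, "initiative")])

# A score of 12 over 200 minutes, as in the quoted example for pupil B.
print(rate_per_100_minutes(12, 200))
```

Converting to a common unit of time matters only when pupils or classes
have been observed for unequal periods; with equal periods the raw
frequencies can be compared directly.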

A disadvantage of predefined categories lies in the fact that once 
units have been decided upon, the observer is not free to adapt his 
account to what he observes; rather, he must fit what he observes to 
his categories. This sometimes means that he is compelled to project 
his own definitions on the behavior that he observes (4). Categories 
based on exploratory study and refined in terms of such study partially
meet this criticism.

For many purposes, however, it is important to consider items of 
behavior in terms of their context and the patterns of which they are a 
part. This relatedness of acts is unique for each situation and thus is 
not accessible to previous categorization. This requires a second 
method of recording observations, the "running account" method 
which enables the investigator to capture the context in which units 
of behavior occur. The observer using this method gives a fluid, run- 
ning account. He records as fast as he can and as much as he can 
hear and see which he feels is relevant to the problem under study. 
This latter consideration is important. Running account observations 
are not planless observations. There must be understanding, before 
the observer begins to take records, regarding the points of emphasis 
in the observation. 

Measurement of observer reliability becomes more difficult when 
this method is used. However, comparisons can be made between the 
content of the records of independent observers, even though a decision
as to the interpretation of a given item of behavior has not yet
been made by the observer. This procedure serves to make explicit
to the observer the forms of behavior he is likely to emphasize in his
observations. Many studies combine checklists, containing predetermined
categories, with space for supplementary running accounts.


LENGTH OF OBSERVATION PERIODS 


Single observation periods may extend anywhere from several min- 
utes to several hours. Total observation of behavior in studies of 
growth and development has been carried on for months and years 
(8). Two factors determine the length of the observation period: the
nature of the phenomenon under study, and feasibility. There is also
a limitation which is imposed upon the duration of observation by 
the division of school activity into class periods. Jersild and Meigs (4) 
advise a series of many short, rotated periods, ranging from five to 
fifteen minutes, when studying the behavior of young children, espe- 
cially in “free” situations. 

The use of shorter observation periods may mean a loss in sequence 
of observation, due to the chain of stimulus-response-stimulus behav- 
ior which exists in social situations. Activities may be initiated before 
the onset of an observation period, and behavior manifested during 
that period may be a part of a sequence of chain behavior reactions 
released by these activities. In shorter observation periods, important 
relationships may be missed or misunderstood by the observer. 

If frequencies of specific behavior units are to be scored, it is nec- 
essary to divide the observation period into time units of observation. 
Whether or not such refined time units are needed depends entirely 
on the nature of the behavior studied and the degree of quantification 
desired. Time units ranging from five seconds to several minutes have 
been used. How often a given behavior is likely to occur and how 
best to discriminate between behavior units will enter into a decision 
regarding the scheduling of time units within observation periods. 
Similarly, a decision to time observations may be made if comparisons 
between observers are indicated for the refinement of categories or 
for a check of reliability, or if two observers are to serve supplementary
functions in the recording of data.
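
Dividing an observation period into time units makes it possible to score
frequencies and to compare two observers unit by unit. The sketch below
assumes a five-minute period split into ten thirty-second units and uses
simple percent agreement as the index. Both the data and the choice of
index are assumptions made for illustration; the text does not prescribe
a particular formula.

```python
def percent_agreement(observer_1, observer_2):
    """Percentage of time units in which two observers made the same entry."""
    if len(observer_1) != len(observer_2):
        raise ValueError("both observers must cover the same time units")
    matches = sum(a == b for a, b in zip(observer_1, observer_2))
    return 100 * matches / len(observer_1)


# Hypothetical records: True means the behavior was seen in that time unit.
obs_1 = [True, False, False, True, True, False, False, False, True, False]
obs_2 = [True, False, True, True, True, False, False, False, False, False]

frequency_1 = sum(obs_1)  # units in which observer 1 recorded the behavior
frequency_2 = sum(obs_2)  # units in which observer 2 recorded the behavior
agreement = percent_agreement(obs_1, obs_2)

print(frequency_1, frequency_2, agreement)
```

Equal frequencies alone do not show that the observers agreed; the
unit-by-unit comparison is what reveals whether they recorded the same
behavior at the same moments.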


POSITION AND EFFECT OF OBSERVER 

The function of the observer makes it necessary for him to be in a 
favorable position for observation. It is advisable for him to take a 
position that enables him to see or hear the behavior which he intends 
to observe. But, as Symonds has remarked, much that passes for ob- 
servation is obtained only from fleeting glimpses, with obstructions to 
the view, and in the face of various distractions (5). The observer 
should be able to locate readily the object of his observation. A device 
available in classroom observation is the seating plan in which each 
child's name is inserted. In order to capture certain reactions it may 
be necessary for the observer to change position or to move rapidly. 
It is often not possible for the observer to assume a position most useful
to him in observation. The better position may interfere with the
natural development of the social situation under study. Thus any 
consideration of observer position is related to considerations of the 
effect of the observer upon the situation. Symonds writes: 




“On the one hand there are those who would remove the examiner 
entirely from the situation. If he is present, it is agreed he should 
be a passive and unobtrusive visitor, but it would be better still if 
he could be invisible. The Yale Psycho-Clinic has devised a screen 
behind which the observer becomes invisible to children in the 
room, and is therefore eliminated as a stimulating factor in the situ- 
ation. The writer became skillful in observing the study habits of 
boys through a technique of concentrating on a certain boy until 
the boy looked at him, whereupon he would turn his attention else- 
where. It is possible to disarm suspicion by apparently being un- 
aware of what is going on. At the other extreme is the observer 
who actually takes an active part in the situation by asking ques- 
tions, making suggestions, teaching and the like. The part an ob- 
server takes in the situation depends on the particular problem 
under investigation." (5) 


It has been found that the presence of an observer distorts the be- 
havior under study less than anticipated. Very young children tend to 
be less conscious of the presence of an observer than are older chil- 
dren or adults. The effect of the observer's presence, however, seems 
to vary with time. As the group observed gets to consider the observer 
as a part of their environment, the natural reaction and interactions 
between individuals resume and valid data can be obtained. The need 
for one-way vision screens would seem limited to situations where the
investigators predict adverse effects from the introduction of an observer.


CAUTIONS TO THE OBSERVER 


Investigators are cautioned to keep the following points in mind 
when employing the method of observation (1). 


a. The significance of observations depends upon the ability,
understanding, and characteristics of the observer.

b. The observer needs to be conscious of the danger of misinterpretation
through the confusion of symptoms with underlying causes.

c. Recordings of observations should be made promptly, so
that none of the important details will be forgotten.

d. Generalizations from observation should be arrived at
only after careful study. Such generalizations should be
held to a minimum.


INTERPRETING THE DATA 


Time spent in obtaining the data may be short in comparison with 
the time that subsequently must be spent in analyzing the data. Less 
time will be spent in analysis if the observations were made simply
to compare frequencies of specific activities rather than if they were
made to gain insight into fundamental behavior processes. In the lat- 
ter event, the records will be interpreted in the light of other perti- 
nent knowledge which the investigator may possess. If he is studying 
aggressive behavior, before interpreting the records gathered by ob- 
servation, he will read related studies and base his interpretations on 
a framework of knowledge and theories developed by other workers 
in this area. 


ADVANTAGES OF DATA FROM OBSERVATION 


Some of the advantages of observational techniques for certain areas 
of evaluation are the following: In a relatively "free" situation—de- 
fined in terms of absence of outside pressure—observation may reveal 
important aspects of the personality and behavior of individuals stud- 
ied (2). The individual does not feel “on the spot.” He is not singled 
out for questioning or testing. He engages in his usual activities and 
interests with others spontaneously. As the child, for instance, meets 
frustrating or satisfying situations we observe his natural modes of re- 
action and response. The records gathered can be systematically
treated and are subject to comparison.


LIMITATIONS OF DATA FROM OBSERVATION 

Observation as a technique requires a considerable amount of skill. 
Observers must be conscious of the difference between describing be- 
havior and evaluating behavior. The personality of the observer be- 
comes an additional variable. The experiences, biases, and values of 
the observer cannot always be completely separated from the behav- 
ior he is recording. Similar behavior manifested by different individ- 
uals may not have the same meaning to different observers. Observa- 
tion, of course, does not give the same insight into individual personal- 
ity structure as do projective techniques. In addition, observational
techniques are time consuming.


Anecdotal Records 


DEFINITION OF ANECDOTAL RECORDS 

Traxler (6) reports several definitions of anecdotal records which 
appear in the literature. Anecdotal records have been defined as "reports
of significant episodes in the life of students," as "simple statements
of incidents deemed by the observer to be significant with respect
to a given pupil," and as "descriptions of actual behavior taking
place in situations noted by the instructor," in contrast with rating
scales which provide records only of the summary interpretation of
the behavior observed.

In general, anecdotal records are a series of notes on exactly what 
a child said or did in concrete situations. As successive observations 
accumulate, the records contain a variety and continuity of evidence 
which yield a picture of the child's behavior patterns and growth, his 
interests and attitudes, his strengths and weaknesses, and problems. 
These records are not to be confused with case studies, which contain 
more extensive data, including developmental and family histories. 
Anecdotal records are reports of current observations of specific inci- 
dents which illustrate the child's reactions. Such observations are en- 
tered on these records frequently enough to give an adequate picture 
of the child's growth. 

A few entries from the record of Bob, an eleven-year-old boy (7), 
will illustrate: 


2/8/55 Before class period when children were all talking 
in small groups, B. stood alone. 


2/14/55 Did not take part in discussion on what we would 
look for at the museum. Sat quietly. 


2/15/55 Wrote about trip to museum but crumpled up his 
paper and threw it into basket. 


2/17/55 Only child who did not volunteer to do something 
for the Mexican play which the children decided 
to produce. 


2/23/55 In the yard at lunch time, asked B. what he liked 
to do best outside of school. He replied, “Play the 
violin. My mother is going to help me become a 
violinist.” Said he had no friends because his 
mother does not allow him to play with the boys 
on the street. Asked him if he would play his vio- 
lin at assembly. He accepted with a smile, agreed 
to rehearsal in class tomorrow. 


2/24/55 Brought his violin and played well. Children ap- 
plauded and congratulated him. He smiled and of- 
fered to play any time we wished. Then he said 
he knew some Mexican songs for the play. (This 
is the first time he has volunteered anything.)


Though anecdotal records are in some respects similar to other observational
techniques, such as time sampling, they differ from them
in method. Other observational techniques are characterized by a systematic
recording of previously defined units of behavior described in
terms of activity over definite time intervals. Anecdotal records are
less carefully controlled than techniques utilizing time sampling. They
approximate the "running account" method of observation though
there still is an important distinction between the two. The requisite
for the successful use of anecdotal procedures is that the anecdotes be
written freely about all kinds of social situations where significant
items of behavior are observed.


ANECDOTAL RECORDS HELP TEACHERS UNDERSTAND CHILDREN 


Anecdotal records help teachers understand and guide children. 
These records direct the observation of the teacher to the child's needs 
and so help in guiding his development. From his observations and 
his study of the records, the teacher gains greater understanding of 
the child's personality and his needs. This knowledge affects the 
teacher's procedures and attitudes and results in a relationship with 
the child which fosters his growth. As the teacher observes and learns 
to know one child well, he learns more about child behavior in gen- 
eral and is then better able to establish the understanding atmosphere 
in which children can best develop healthy personalities. 

Anecdotal records are of value not only to the teacher who makes 
them, but also to subsequent teachers and to other staff or agency 
workers who are concerned with the child's development and adjustment.
If these persons have this information available, they do not
have to begin from scratch in accumulating data about a child, and, 
in dealing with him, they are able to avoid unnecessary mistakes 
caused by a lack of knowledge. Thus, the records make possible more 
effective guidance of the child's development throughout his school 
career. 

The actual observations may seem much more important than the
written records, but the record is also essential as concrete evidence
both for the person making the notations and for other school personnel
interested in a particular child's adjustment. A systematic record
to which one can refer for an accurate account of the child's behavior
is certainly preferable to memory. Almost any elementary psychology
textbook will cite experimental evidence of how much people forget.
Not only do we forget, but we tend to forget that which does not
agree with our previous impressions and to remember that which confirms
our ideas and attitudes. Consequently, without records, interpretations
of a child's behavior are likely to be impressionistic and subjective.




OBJECTIVES TO BE EVALUATED BY ANECDOTAL RECORDS 


If the teacher is not to become swamped by a mass of unrelated 
detailed observations, he must know where to focus his observations. 
It is desirable to concentrate on a few major aspects of child growth 
for which information cannot be obtained except by observation. Ade- 
quate data on physical status, academic achievement, and study skills 
can be obtained from tests, rating scales, and checklists using predetermined
categories. Consequently, observations in these areas of
physical and intellectual growth should be recorded only if they have 
a marked influence on the child's personality. Objectives for whose 
evaluation anecdotal records are most valuable are those in the areas 
of social adjustment and growth, personal and emotional adjustment 
and growth, as well as other factors which may have a major influence 
on a particular child's adjustment. 


Social Adjustment and Growth 


By social adjustment of the child is meant his relationships with 
other children and with adults. The school is committed to an im- 
provement of inter-personal relations and to increasing the social ad- 
justment of individual children. Anecdotal records can play a great 
part in helping the teacher and the outsider evaluate progress made 
in these directions. 

The records include incidents that illustrate the child's relations 
with other children—e.g., incidents that show whether and in what 
situations he plays alone or with others, with whom he plays, whether 
he works well with a group, leaves a group if he cannot get his own 
way, or fights with other children. Also included are entries illustrating
how well he gets along with his teacher and other adults—e.g., 
whether he requires much attention or affection from adults; adjusts
well with adults, but poorly with children; is shy and fearful of adults,
or is antagonistic to them.


Personal and Emotional Adjustment and Growth 


Attempts undertaken in promoting personal and emotional adjust- 
ment and growth are similarly subject to evaluation and clarification 
through the use of anecdotal records. Incidents illustrating the child's 
independence or dependence, his manner of meeting new situations, 
his assumption of responsibilities, and his demonstration of interests 
are included under this heading. Incidents illustrating his pattern of 
emotional reactions, showing whether he is cheerful, laughs readily, 
cries easily, worries, is fearful, is enthusiastic, is easily discouraged, 


has temper tantrums, or shows nervous mannerisms all should be re- 
ported. Sometimes physical manifestations like vomiting, headaches, 
and indigestion may be symptoms of emotional difficulties. 


Other Factors 

Since the school is concerned with the total adjustment of the child, 
anecdotal records may be kept dealing with any other factor which 
has a major influence on a particular child's adjustment. 

These factors will vary for different children. An unusual event in 
a particular child's life may have a marked effect upon his adjustment 
—e.g., birth of a new baby, illness of a parent or of the child himself, 
change in the economic status of the family, separation of parents,
placement in a foster home, or moving to a new community. With
another child it may be some constant home factor, e.g., home standards
in conflict with neighborhood standards, quarreling parents, or a
working mother. With still another it may be the discovery of a special
talent, a strong interest, or a special disability.


NUMBER OF CHILDREN OBSERVED 

Records for just a few children are preferable at first. Since consid- 
erable practice is required to develop the techniques for recording and 
interpreting anecdotal records, and since a teacher's first efforts will 
necessarily include a period of experimentation and floundering with 
a new type of record, the most effective way to begin is by studying
a few children only. It is suggested that, at first, no more than two
children be observed. If the teacher who is just beginning to keep
records studies a few children only, he has an opportunity to work 
out his techniques as well as to learn about the children's needs. If, 
instead, he attempts to keep records for the whole class, he will be- 
come swamped with the process and, as a result, the reports may 
become a chore that will not meet the needs of the individual child. 

Experience in keeping cumulative records is necessary to develop 
techniques. To learn to use cumulative anecdotal records most ef- 
fectively in obtaining a continuous picture of the child’s growth, it is 
necessary to keep records of behavior over a period of time. As the 
teacher studies the child and makes his notes, his techniques of observation,
recording, and interpretation will improve. He will become
more expert in selecting events which are significant for the child and
in writing objective reports of these incidents. His records will present
an increasingly accurate picture of the quality of the child's behavior.
Most teachers will probably need at least a year of




experience in writing and interpreting anecdotal records for a few 
children before they have mastered the techniques. 


WHERE CHILDREN ARE OBSERVED 


Observations for anecdotal records should be made in a variety of
situations, since children show different behavior under different circumstances.
To obtain a fair sampling of the child's reactions, he should
be observed incidentally in a variety of activities and situations: in 
painting or other aesthetic expression, free play, athletic games, dra- 
matic improvisation, discussion, and other learning situations; on trips 
and in the classroom, the halls, the assembly, the lunchroom, and the 
yard. If the record is really to reflect the child's individuality, most ob- 
servations should be made in situations in which he has freedom to 
display a variety of reactions. For an adequate picture of inter-per- 
sonal relations, it is necessary to observe the child in flexible situa- 
tions such as those just mentioned, wherever the child has an oppor- 
tunity to choose his companions and to have social relations. 

Information about the child's reactions at home is often vital in in- 
terpreting the child's behavior and should be obtained whenever pos- 
sible by means of informal interviews with the parents. Often teachers 
have used the opportunity to interview a parent when she has come 
to school to attend a parent-teacher meeting or a class tea, or when 
she has come to call for a younger child. Ideally, there should be a 
regular time set aside for conferences with mothers and fathers. 


CRITERIA FOR RECORDING ANECDOTES 


Certain uniform standards should be observed in recording anecdotal
material. The following list includes some of the essentials for
any anecdotal record keeping.


1. Each entry should be dated, so that the sequence and lapse of time 
are clear when the record is reviewed for evidence of the child's 
development and growth. Incidents should be recorded on the day 
on which they occur before memory of them becomes distorted. 

2. Each entry should contain some statement of the situation in which
the incident occurred, so that it can be properly interpreted. A
child's shouting excitedly during a ball game and his shouting in the
midst of an arithmetic test would certainly be interpreted differently.


Examples:


4/15/54 During class discussion 
4/22/54 While painting in class 


3. Each anecdotal entry should be a brief factual description of an incident,
complete enough so that it can be understood later, when an
attempt is made to evaluate the child's behavior.


Example: Entry for a first-grade girl, Arlene:


10/3/54 Came in crying this morning, "My dear, dear doggie
was at the kennels and my daddy told me that
he is not coming home again."


4. Entries should be objective reporting of facts insofar as possible. A 
generalized statement or tentative interpretation may be necessary 
to make the picture clearer, but should always be accompanied by 
reports of specific incidents and should be based upon adequate 
facts. Interpretations and generalizations should be placed in paren- 
theses to differentiate them from factual data. 


Traxler points out that many teachers obscure the report of what 
they observe with subjective statements of opinion concerning inter- 
pretation and treatment (6). He gives the following example of 
such a “mixed” anecdote.


“In a meeting of her club today, Alice showed her jealousy of the
new president by firing questions at her whenever there was an
opportunity. She tried to create difficulties by constant interruptions
throughout the period. The other students showed their resentment
by calling for her to sit down. It is apparent that she is a
natural trouble-maker, and I think her counselor should have her
in for a serious talk.”

Traxler explains that the phrases “showed her jealousy,” “showed 
their resentment,” and others are value statements and do not rep- 
resent an objective description of what happened. 

He objectifies the report in this manner:


Incident:
In a meeting of her club today, Alice fired questions at the
new president at every opportunity. She interrupted many
times during the period. On several occasions the other students
called for her to sit down.


Interpretation of the incident may then be added to the objective
description, to quote:


Interpretation: 
Alice seemed to be jealous of the new president and desirous 
of creating difficulties. The other students appeared to resent 
her actions. The girl seems to enjoy making trouble for others. 


Sometimes observations which are fused with evaluative com- 
ments display good insights into the behavior observed. These in- 
sights would stand out more clearly, however, if they were sepa- 
rated from the descriptive material. 

5. Entries of incidents showing desirable, passive, inconspicuous, or 
non-participating behavior are as important in giving a true picture 
of the child as incidents of undesirable or dramatic behavior. 

6. Enter information about home attitudes when it is obtained, since 
it may contribute clues to the interpretation of the child's behavior 
which no amount of school observation will yield. 

7. Have an adequate number and sequence of anecdotal entries upon 
which to base judgments and interpretation of behavior. For some 
children more entries are necessary than for others. A teacher just
beginning to keep records should make only about one entry a
week for each chosen child, more if necessary. With experience,
fewer entries may be found sufficient.


COMMON ERRORS IN RECORDING WHICH SHOULD BE AVOIDED 


1. A common error of beginners is to give generalized descriptions or
evaluations of behavior rather than specific incidents.


Examples of generalizations: 


Chatters all the time. 
Is interested in art. 


Examples of evaluations: 


Is as lazy as two donkeys. 
Is very sensitive—a good child. 


2. Another common error is to give the teacher’s personal reaction to
the child rather than the child’s behavior.

Examples:


Isn’t an interesting or colorful subject to report on. Gets in 
your hair. Shows no interest. Wish I could see what's behind 
the Iron Curtain she's set up.


This entry shows the teacher’s reaction but gives us little informa- 
tion about the child's actual behavior. 


A sweet, charming child. 


This again indicates the teacher's reaction rather than the child's 
specific behavior. 


3. Beginners tend to interpret behavior before there are adequate
data and to confuse such interpretations with facts.

4. Recording primarily negative or dramatic incidents is a tendency
that must be guarded against. The record must not be allowed to
become a report of a child's misdeeds and failure to conform. It
should be a balanced, not a one-sided, picture of the child.

5. There is no need to worry if a significant item is omitted. It is impossible
and unnecessary to record everything. The record contains
only samples of the child's behavior. If the behavior that has
not been recorded is really important in the child's adjustment and
growth, the same pattern of behavior will be repeated and there
will be later opportunities to record it. If that pattern is not repeated,
the original incident was probably not important.


METHODS OF RECORDING ANECDOTES 


Anecdotal records may be kept on simple forms. The form for the
original record may be placed on a small card (3" by 5", or 5" by 8"),
a half sheet, a full sheet, or a page that will fit into a loose-leaf notebook.
The size of the form will determine the number of anecdotes
which can be written on each. A sample form is given here.


Anecdotal Record

Pupil ____________________    Class ____________________

Incident                      Comment / Interpretation




Observer ____________________


FIGURE 2 Sample Anecdotal Record Form


The advantage of using small cards is that a few cards may be
carried in one's pocket and brief notes may be made when there is oc- 
casion to do so. Original records for each child will be filed to- 
gether in chronological order to provide a cumulative picture of the 
child for later review. 
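
For readers who keep such records electronically, the recording criteria and the chronological filing just described might be sketched as follows. This is only an illustration under stated assumptions: the names `AnecdotalEntry` and `file_chronologically`, the field names, and the sample entries are all hypothetical and are not taken from any published form. The point of the sketch is that each entry carries its date and situation, and that interpretation is stored apart from the factual incident.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AnecdotalEntry:
    """One dated entry. Field names are illustrative assumptions."""
    pupil: str
    entry_date: date      # each entry is dated
    situation: str        # setting in which the incident occurred
    incident: str         # brief factual description only
    interpretation: Optional[str] = None  # kept apart from the facts
    observer: str = ""


def file_chronologically(entries, pupil):
    """Return one pupil's entries in date order, as a cumulative record."""
    own = [e for e in entries if e.pupil == pupil]
    return sorted(own, key=lambda e: e.entry_date)


# Invented sample entries, for illustration only.
entries = [
    AnecdotalEntry("Alice", date(1954, 4, 22), "Club meeting",
                   "Fired questions at the new president at every opportunity.",
                   interpretation="Seemed to be jealous of the new president."),
    AnecdotalEntry("Arlene", date(1954, 10, 3), "Arrival in the morning",
                   "Came in crying; said her dog was not coming home again."),
    AnecdotalEntry("Alice", date(1954, 4, 15), "Class discussion",
                   "Interrupted other speakers several times."),
]

record = file_chronologically(entries, "Alice")
```

Keeping `interpretation` as a separate, optional field mirrors the practice of placing interpretations in parentheses or in their own column, so that the factual description can always be read on its own.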


INTERPRETATION OF CUMULATIVE ANECDOTAL RECORDS 


Teachers often hesitate to summarize and interpret anecdotal rec- 
ords. In his daily work, the teacher is constantly called upon to make 
tentative interpretations upon which to base his decisions for handling 
children and he does so with assurance. However, when he is ready 
to interpret anecdotal records he becomes timid and doubtful of his
ability and feels that the interpretation should be an elaborate psychological
diagnosis to be made by experts. True, experts should be called
in to help in interpreting some cases, but for most children the 
teacher can evaluate the record himself. His intimate contact with 
so many children of a particular age gives him a rich background of 
comparative data to help him evaluate an individual child's behavior. 
Naturally, the broader the teacher's background in child study and
psychology is, the more accurate and comprehensive will be his in- 
terpretations. As he gains experience in evaluating records, he will 
find it easier to make more intensive interpretations. It should be em- 
phasized, however, that no elaborate case study or diagnosis can be 
made solely on the basis of anecdotal records. 

The following steps are involved in interpreting the records. 


1. The notes are reviewed periodically, every month or two, to see
what trends in behavior are revealed, what suggestions for handling
the child are implied, and to decide what further information
should be obtained.

2. The facts are examined to see if there is sufficient information upon 
which to base conclusions for each aspect of growth. If the data 
are inadequate, further observations should be made and addi- 
tional information sought. Conclusions should be drawn only on 
the basis of repeated instances. 

3. At the end of each half year, the teacher writes a summary and
interpretation of each record. This report is not a case history, but
should help in planning for the child's social and emotional growth
and adjustment. The report should include a summary of repeated
patterns of behavior and evidence of growth or regression shown in
the record. If there is an adequate series of dated and classified
facts, these major trends will stand out. If the data in any area
are not sufficient to show a clear picture, this fact should be stated.


On the basis of the summary, the teacher makes a tentative inter- 
pretation of the child's motivations and of his major problems and 
needs. This interpretation is based upon the teacher's knowledge of
the particular child and of scientific principles of behavior. The child's 
behavior is interpreted in the light of his level of maturity, his home 
relationships, the economic and cultural level of his family and neigh- 
borhood, and other related factors. As further work is done with the 
child, the interpretation may have to be changed. It is only an hy- 
pothesis. It is the best that can be done with the available informa- 
tion, but that information is always just a small sample of the child's 
total behavior. Only by seeing the outcome of attempts to help the 
child, based on these interpretations, is it possible to determine 
whether the evaluations are valid. Increasing confidence can be placed
in these interpretations as a number of successive teachers give consistent
reports over a period of years.


Summary 


Direct observation and recording of behavior is the basis for collecting
data by means of observational techniques and anecdotal records.
Observational techniques include the structured, or controlled,
recording of observations by using defined categories of activities in
a running account of pupil conduct. Anecdotal records are cumulative
notes of an individual's behavior observed in typical situations.


The types of social situations for which observations are made in- 
clude nursery school, classrooms, factory, office, discussion group, 
home, and community activities. These may be free or natural situa- 
tions, manipulated situations, or partially controlled situations. The 
objectives of observation usually emphasize social growth and adjustment,
for example: solitary play, cooperative activity, personal independence,
nervous habits, study habits, interests, kinesthetic movements,
and other diverse activities.

Two major methods of recording observations are used. In one, the 
units of behavior to be observed are defined in advance and a record 
is made of those activities only which fit the defined categories. In 
the other, a running account of observations is made without the bene- 
fit of predefined categories. The first method is adapted to formal 
studies but the second method is better fitted to informal situations. 
Such factors as the length of the observation period and the position
or effect of the observer are directly related to the purpose of a particular
evaluation. Length of each observation must be determined
by activity and objective, based upon an analysis of the situation. 
This may vary from one minute to fifteen or thirty minutes for each 
one of a series of observations. Generally, the presence of an observer 
distorts the behavior under study less than is anticipated. 

Observations and their validity depend upon the ability, under- 
standing, and characteristics of the observer. Recording of observa- 
tions should be made promptly and objectively. The observer should 
be careful to discriminate between symptoms and underlying causes. 
Generalizations from observations should be cautiously made after 
extensive samples of activity have been recorded. 

Anecdotal records are a series of notes on exactly what a child did 
or said in a specific situation. Objectives for which cumulative anec- 
dotal records are most valuable include social adjustment and growth, 
personal and emotional adjustment, and related factors. 

It is recommended that anecdotal records for only a few children 
be kept at first. As a teacher or observer gains experience, the number 
of children for whom anecdotal records are recorded may be increased 
as judgment indicates. The situations in which children's activities are 
recorded should emphasize flexible relationships on the playground, 
in the lunchroom, in committee work, in a play or club period, or 
wherever the child has a maximum of choice in expressing personal 
and social relationships with others. 

Minimum standards for recording anecdotal records include: date 
of each entry, statement of situation in which incident occurred, a 
factual description of the incident, objective reporting of any related 
information, and an adequate number and sequence of anecdotal records
upon which to base a judgment. Common errors to be avoided
are: generalized evaluation of behavior rather than description of
specific incident, observer’s reactions but no objective facts on incident,
recording of negative data only, generalizing before sufficient
data are collected, and concern that a pattern of behavior will not
occur after a previous observation. The method of recording should 
be kept as simple as possible. 

So long as an adequate sample of observations is obtained, interpretation
of the cumulative records is possible. For adequate interpretation
of records, it is recommended that the notes be reviewed
periodically to check trends in behavior, to determine if data are inadequate
in any areas, and to prepare a summary and interpretation
of the record to help in planning for the child's social and emotional 
growth and development. 




Problems for Class Discussion 


1. Make a "running account" of the physical movements of each of ten
children, allowing three minutes of observation for each child. Examine 
your "account" and organize the physical movements into tentative pre- 
defined categories for checklist observation. 

2. Make anecdotal records of the nervous habits or mannerisms of three 
children over a period of a week. Keep your entries specific and ob- 
jective. Interpret your records for each of the three children to make 
tentative generalizations about their nervous behavior. 


References Cited in This Chapter 


1. Division of Research and Guidance, Los Angeles County, Guidance Hand- 
book for Secondary Schools. Los Angeles: California Test Bureau, 1948. 

2. Gates, A. I., et al., Educational Psychology. New York: Macmillan Com- 
pany, 1948. 

3. Goodenough, F. L., “Measuring Behavior Traits by Means of Repeated
Short Samples,” Journal of Juvenile Research, 12:230-235, September-
December, 1928.

4. Jersild, A. T., and Meigs, M. F., “Direct Observation as a Research 
Method,” Review of Educational Research, 9:472-482, December, 1939. 

5. Symonds, P. M., Diagnosing Personality and Conduct. New York: Ap- 
pleton-Century-Crofts, Inc., 1931. 

6. Traxler, A. E., The Nature and Use of Anecdotal Records. New York: 
Educational Records Bureau (Revised), 1949. 

7. Wrightstone, J. W., and Krugman, J. I., A Guide to the Use of Anecdotal 
Records, Educational Research Bulletin, No. 11. New York: Board of 
Education of the City of New York, May, 1949. 

8. Wrightstone, J. W., “Constructing an Observational Technique,” Teach- 
ers College Record, 37:1-9, October, 1935. 

9. Wrightstone, J. W., “Measuring Teacher Conduct of Class Discussion,” 
Elementary School Journal, 34:454-460, February, 1934.


References for Further Reading 


Biber, B., et al., Child Life in School, p. 88-58. New York: Dutton, 1942. 
The section on “Recording Spontaneous Behaviors” provides examples 
of the use of observational techniques in actual school situations. 


Hamalainen, A. E., An Appraisal of Anecdotal Records, Teachers College
Contributions to Education, No. 891. New York: Bureau of Publications,
Teachers College, Columbia University, 1943.
The author reviews the advantages as well as the difficulties of the
introduction and use of anecdotal records in the typical public school.

Jarvie, L. L., and Ellingson, M., Handbook of the Anecdotal Behavior Journal.
Chicago: University of Chicago Press, 1940.
Methods of recording, collating, and interpreting anecdotal records are
discussed by two pioneer workers.


Questionnaires, Inventories, 
and Interviews 


CHAPTER EIGHT 


Questionnaires, inventories, and interviews are similar
techniques for gathering data by securing answers to questions. On
the questionnaire, the respondent writes answers to a limited number
of questions. The questions may refer to matters of fact or matters of
opinion. On the inventory, one may write (or encircle “yes,” “no,”
etc.) short responses to a rather complete set of questions. In the interview,
one communicates verbally and directly, face to face with the
questioner or counselor. There is a rather arbitrary division between
the questionnaire and the inventory; however, there is no mistaking
the differences between the questionnaire or inventory and the interview
when one considers format, situation, and purpose. When the
questionnaire, inventory, or interview is used for an evaluative or diagnostic
purpose, it should have a reasonable and appropriate degree
of reliability, validity, and objectivity, and should fit the practical
situation.


Some confusion ma
all questionnaires, or
As a matter of fact, there is no absolute or set pattern to which all

The Questionnaire 


CONSTRUCTING THE QUESTIONNAIRE 

The impression of many people is that a questionnaire is a very
simple instrument which can be constructed quickly. Such individuals
usually mimeograph a list of questions, with a space after each one
where people can write their responses. They give little thought to
the many factors which make the difference between a poor and a
good questionnaire.

For practical purposes, the individual who desires to construct a
questionnaire should observe the following points:

a. Use the questionnaire technique when it is most appropriate.
b. Define general purposes and specific aims.
c. Construct appropriate questions.
d. Arrange questions in appropriate groupings.
e. Design the format with appeal.
f. Check the questionnaire for adequacy.

What factors should be considered under each of these steps?


Use the Questionnaire When Appropriate. There are a number of
conditions under which the questionnaire is appropriate. A questionnaire
may be useful to the teacher when answers to a series of important
questions are needed from a number of students within a limited
time. For instance, after a group of students returned from a two-month
field trip as part of a regular sociology class, the evaluator administered
a questionnaire in order to secure answers to many questions
about aspects of the trip which could not be gathered in the
time available through interviews. The classroom teacher may be
faced with a similar situation at the beginning or end of a term, when
student reactions prior to or subsequent to school experiences are desired
for evaluation purposes. Here, too, the time element is important.
When there is insufficient time to interview or to contact each
person or group of persons personally, the questionnaire may be a
time-saver. Valuable class minutes can be saved by utilizing a questionnaire
which can be filled out at home or outside of class. This is
not an unmixed blessing, for some students may forget to bring their
completed questionnaires to class.

The questionnaire is feasible when it is not possible to reach each
person directly by telephone or by personal visit. In such cases, individuals
can be reached through questionnaires by mail or in class or
interest groups in the school. In a class group, a teacher who wants
the reactions of all students to a few questions must guard against
suggestion and collaboration. If the teacher asks for a show of hands
in class voting or polling, there is a tendency for students to watch
each other and vote with the "crowd" or to follow certain "leaders."
A major value of a written questionnaire in a class situation is the
privacy accorded each response.

Questionnaires are appropriate also for securing data which are not 
readily available or not conveniently assembled. For instance, while 
data on high-school students may be available in personal records or 
cumulative files in the main office, a subject-matter teacher cannot 
conveniently obtain the specific data he desires and have such data 
at hand for reference purposes. Also, there are times when data on 
any given group of individuals are filed in four or five different places. 
For convenience, it may be advisable for the teacher to develop ques- 
tionnaires for his student groups on the specific material he needs. 

A pencil-and-paper technique such as the questionnaire is generally 
best suited for collecting impersonal or general data about the stu- 
dent. Many times the student hesitates to record on paper certain 
“unpopular” personal data. However, it should not be stated with ab- 
solute certainty that because a matter is very personal, the interview 
rather than the questionnaire should be employed. It may be that 
some people, ashamed of speaking face to face about personal mat- 
ters, are happier to record on paper what otherwise would be difficult 
to discuss. As in so many cases or situations, the judgment of the 
teacher is significant in determining what technique is most appro- 
priate to the material, the person, and the situation. 

The questionnaire is frequently employed as a means of comparing
persons or groups. For example, the teacher may desire to compare
the responses of a class of boys with those of a class of girls on the
same grade level on an interest questionnaire. The fact that both
groups have been asked the same questions, in the same order, and
at the same time, permits the teacher to draw appropriate conclusions.

In summary, then, the questionnaire may be used when

a. The group may soon break up because it is temporary.
b. The group is together for the first or last time.
c. There is insufficient time for individual interviews.
d. There are too many to be interviewed.
e. There are too many people who can't be reached personally.
f. An independent response from each person is desired.
g. Desired data are either non-existent or not conveniently
available.
h. Answers to a comparable set of questions are desired.


Define Purposes and Aims. One of the main weaknesses in questionnaire
construction is the lack of clearly stated purposes and aims.
Many individuals have a blind faith that, somehow, pulling together
the answers to a number of questions will reveal valuable information.
This is a misplaced faith in the process of “what comes naturally.”
Unless one formulates clear purposes, the process of constructing
questions for the questionnaire becomes blind and inefficient. The
teacher may find, after the responses are tabulated, that he forgot to 
include questions whose answers are essential or that much of the 
data collected was irrelevant, tedious to tabulate, or a sheer waste of 
valuable time. The purposes of the questionnaire become the criteria, 
therefore, for the inclusion or exclusion of any question. 

One of the most important reasons for administering a questionnaire 
is to secure background data on an individual (or group) which may 
be valuable in accounting for classroom behavior (6). To know that a 
boy is an orphan who is working part-time every day after school may 
illuminate the reasons why the boy is so reserved and serious in class. 
Also, characteristics of a whole class may account for the level and 
type of work of the group. For instance, the fact that all the students 
reside in a well-to-do suburban area may account for their musical 
and artistic accomplishments and interests. The first important reason 
for using a questionnaire, therefore, is to gather data which bear di- 
rectly or indirectly on the educational process. 

The questionnaire may also be used to secure a pencil-and-paper aid 
in evaluating the extent to which educational objectives are being 
realized. An objective like “encouraging students to attend symphony
concerts” may be said to have been achieved if the student who has 
never attended such a concert buys a subscription to a series or goes 
to one or two concerts. A questionnaire may contain questions, there- 
fore, which will permit the student to reveal what new interests, skills, 
appreciations, knowledge, and attitudes he is developing as a result 
of classroom stimulation. Questions may be designed to secure either 
quantitative or qualitative data pertaining to the achievement of student,
teacher, or class objectives.

A third purpose in using a questionnaire is to secure data which
will be pertinent to planning the curriculum. A student teacher in one
high school included the following questions in her physical and
health education questionnaire in order to help her plan her work in
accordance with student interest and needs:


Directions: If you agree with the following statements, circle the A;
if you disagree, circle the D; if you are uncertain, circle
the U.

A U D  1. I like team sports better than individual sports.
A U D  2. I would like to have bowling put in the physical
          education program.
A U D  3. Folk dancing is a good way to start boys and
          girls dancing together.
A U D  4. Ballroom dancing should be taught in school.
A U D  5. I would like to have modern dance taught in
          the ninth grade.
A U D  6. When learning ballroom dancing, I would like
          to be in a class that includes boys and girls.
A U D  7. I would like to play touch tackle instead of
          hockey.
A U D  8. Apparatus work is a waste of time.
A U D  9. Track is more fun than tennis.
A U D 10. I would like to learn to shoot a rifle.
A U D 11. I enjoy playing badminton.
A U D 12. I would rather have dancing than sports in gym
          classes.
A U D 13. I would like to learn to fence.
A U D 14. If I could get out of taking physical education,
          I would.
A U D 15. What I learn in gym class will not be of any
          use to me after I get out of school.


Directions: Under each season list the number of the activity which
you would like to have at that time. List as many as you
wish under each season.

1. Apparatus work     8. Hockey          15. Soccer
2. Archery            9. Horseshoes      16. Softball
3. Badminton         10. Marching        17. Swimming
4. Basketball        11. Tumbling        18. Tennis
5. Bowling           12. Ping-pong       19. Touch Tackle
6. Dancing           13. Riflery         20. Track
7. Exercises         14. Shuffleboard    21. Volleyball


Fall (Sept., Oct., Nov.)    Winter (Dec., Jan., Feb.)    Spring (Mar., Apr., May)


If there are any physical education activities that we do not
have, but you would like to have added to our program, list
them below.

1.                    4.
2.                    5.
3.                    6.

An examination of these questions shows that the same information
can be used in different ways. While this information is valuable for
planning the work of the class, it can also be used to reveal background
data about the student. Also, if the questionnaire were used
at both the beginning and the end of a school term, the teacher might
be able to draw some conclusions about the changes which have
taken place for each person or for the whole class.
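
The comparison of beginning-of-term and end-of-term responses can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `tally_changes`, the student names, and the responses are all hypothetical, and the A/U/D coding is borrowed from the sample questionnaire above.

```python
from collections import Counter


def tally_changes(before, after):
    """Count how responses to one item shifted between two administrations.

    `before` and `after` map a student name to that student's response
    ("A", "U", or "D"). Only students present in both are compared.
    """
    changes = Counter()
    for student, first in before.items():
        if student in after:
            changes[(first, after[student])] += 1
    return changes


# Hypothetical responses to a single item at the start and end of the term.
start_of_term = {"Ann": "A", "Bob": "A", "Cal": "U", "Dee": "D"}
end_of_term = {"Ann": "D", "Bob": "A", "Cal": "D", "Dee": "D"}

for (first, last), count in sorted(tally_changes(start_of_term, end_of_term).items()):
    print(f"{first} -> {last}: {count}")
```

The tally keeps each (first response, later response) pair separate, so the teacher can see the change for the whole class while the two dictionaries still show the change for each person.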

The fourth main purpose in using a questionnaire is to secure a
picture of the status of an experience, a unit, a project, a study, or a
group. For instance, the teacher may want to know how many boys
and girls work after school, have home responsibilities, follow hobbies,
attend the movies, view television, etc. Such a study of outside
school activities would be a description of the status quo, conditions
as they are, among the students. Another type of status study might
involve questioning the groups on how far each one has progressed
in various activities (in- or out-of-school), the reactions of the students
to each activity, and the plans the students have to continue or discontinue
such activities. The questionnaire which follows was used to
secure a picture of the status of the field experience of a group of
freshman students preparing to teach.


COMMUNITY-FIELD EXPERIENCE PROJECT—STATUS INVENTORY

Council of Social Agencies—Teachers College, University of Cincinnati
(Volunteer)

Date ______
Instructor’s Name ______

     1. Name ______
     2. Education 1, Section No. ______
     3. For what agency did you volunteer?
     4. Who was your immediate supervisor in this agency?
     5. Will the second semester cause any change in your plans to
        volunteer in a social agency? If yes, indicate nature of change.
     6. How many times have you been to your agency? (Encircle one)
        1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
Y N  7. Do you feel you have lost interest since you first volunteered?
        If yes, why?
Y N  8. Do you find you are spending more time than you should on the
        agency assignment? If yes, is it more than two or three hours
        per week?
Y N  9. Do you look forward to your agency experience each week?
Y N 10. Do you make any advance preparation for your agency experiences
        each week?
Y N 11. Do you feel as though you are needed in your job with the
        agency?
Y N 12. Do you have the feeling you are a part of the agency staff?
Y N 13. Do you feel that the agency has helped you to become a part
        of the staff?
Y N 14. Do you feel that you are getting all the help and supervision you
        need to carry out your job successfully in the agency? Why?
Y N 15. Do you feel you are getting adequate cooperation from the
        agency people in carrying out your work? Explain.
Y N 16. Have you had any training while you were on your agency job?
        What type? Adequate?
Y N 17. Do you feel you need any additional training for your agency
        job? If yes, what kind? If not, why?
Y N 18. Are there any factors in your agency job, including agency
        policy, which you think need modification? Explain specifically
        and indicate what improvement you think is possible.
Y N 19. Do you feel that there are proper channels for offering criticisms
        and suggestions in your agency? Illustrate.
Y N 20. Do you feel that participation in the agency project has definitely
        helped you attain what you originally desired and needed?
        Explain your answer.
Y N 21. Do you feel that your agency experience is contributing to your
        preparation for teaching? Explain your answer.
Y N 22. Do you feel that you would like to continue your agency experience
        next year?
Y N 23. Do you feel after this experience that you would like to continue
        as a volunteer youth leader after you graduate from
        college?

On the back of this sheet please write any suggestions which you have not
mentioned already for improving the social agency field experience.


Construct Appropriate Questions Since the main purpose of a question
is to secure from the respondent a valid and reliable answer,
factors which promote this goal should be carefully observed. Some
of the typical kinds of errors made in question construction might
be examined.

In the first place, a question which is not clear to the reader is 
unlikely to yield a proper response. Lack of clarity may be due to 
vocabulary difficulty, sentence complexity, the use of ambiguous 
terms, and blurred printing of questions. Stated positively, a clear 
question is one whose (a) vocabulary is understandable, (b) phras- 
ing is simple and straightforward, (c) terms are unequivocal, and (d) 
print is readable. 

In the second place, a question may not be valid in a questionnaire 
when it is double- or triple-barreled. For instance, in response to a
question “Do you enjoy playing Bridge, Poker, and Canasta?” the
person who says “No” may mean that he enjoys one or two of the
games but not all three. If each game had been listed separately,
there might have been two “yes” responses and one “no” response.
The weakness in a double-barreled item lies in the difficulty of interpreting
the response to parts of the whole question. The illustrative
question should be phrased “Do you enjoy playing Bridge ( ),
Poker ( ), Canasta ( )? (Check).”
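
The gain from listing each game separately shows up at tabulation time. The following Python sketch is a hypothetical illustration only: the function `tally_checks` and the four sample respondents are assumptions, not data from any study.

```python
def tally_checks(responses, options):
    """Count how many respondents checked each option separately."""
    counts = {option: 0 for option in options}
    for checked in responses:
        for option in checked:
            if option in counts:
                counts[option] += 1
    return counts


games = ["Bridge", "Poker", "Canasta"]

# Each set holds the games one hypothetical respondent checked.
responses = [{"Bridge", "Poker"}, {"Canasta"}, set(), {"Bridge"}]

print(tally_checks(responses, games))
```

With the single three-game question, the first, second, and fourth respondents might all have answered “No”; the separate checks preserve what each one actually enjoys.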

A third weakness in constructing questions is that of confining the 
respondent to a choice which does not describe his position. Suppose 
Mary had to answer the question, “When you sew, do you prefer a 
sewing machine ( ) or a needle ( )? (Check one).” If Mary did 
not know how to sew, she really could not answer the question because
the choices are not suitable for a non-sewer. Under such circumstances,
one would have to ask “Do you sew? Yes ( ), No ( ).”
If the answer is “yes,” one could be requested to answer the preference
item “Do you prefer a sewing machine ( ) or a needle ( )?”
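
The two-step arrangement, a screening question followed by the preference item, can be sketched in Python as follows. The record layout (keys `sews` and `prefers`) and the sample answers are hypothetical and are chosen only to illustrate the idea.

```python
def sewing_preferences(answers):
    """Tabulate the preference item only for those who said they sew."""
    tally = {"machine": 0, "needle": 0, "does not sew": 0}
    for answer in answers:
        if answer["sews"] == "yes":
            # Only sewers are asked the preference item.
            tally[answer["prefers"]] += 1
        else:
            # Non-sewers are counted, but never forced into a choice.
            tally["does not sew"] += 1
    return tally


# Hypothetical answers from four students.
answers = [
    {"sews": "yes", "prefers": "machine"},
    {"sews": "no", "prefers": None},
    {"sews": "yes", "prefers": "needle"},
    {"sews": "yes", "prefers": "machine"},
]

print(sewing_preferences(answers))
```

Keeping the non-sewers in their own category avoids the distortion that would result if Mary had been obliged to check one of two choices that did not describe her.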

A fourth weakness is to include too many questions. A few well- 
constructed, important questions are superior to a large number of 
unimportant questions. There is always a temptation on the part of 
the person constructing a series of questions to include many “interesting”
questions because there is space available. Such practices
tend to disturb the respondent who, feeling perhaps that too many 
unimportant questions are being asked, may fail to answer all items 
seriously. Research has shown, however, that interest and willingness 
to answer are more important than length in determining the num- 
ber of replies which will be received to a questionnaire (5). 

A fifth major weakness in question construction is the failure to 
consider how the item will be tabulated. For instance, to the item
“How old are you? ( )” the respondents might write in 8, 10, 12,
13, 14, 18, etc. When it is necessary to tabulate the answers to such 
a question, it may be advisable for one to group the ages, possibly 
8-11, 12-15, 16-19. In that case, it would have been easier to have 
the individual respond to the item “check the age group to which you
belong, 8-11 ( ), 12-15 ( ), 16-19 ( ).” This question is both
easier to tabulate than the previous one and subject to less error in 
classification and tabulation. When one knows or has decided how the 
data to any given question will be tabulated, the question can be 
fashioned accordingly. In the case just cited, if three age groups were 
going to be compared, it would be best to frame the age question in 
the latter form, which permits the tabulator to sort the three age 
groups from one another at a glance, without thinking about the 
individual age and the group to which it belongs. To summarize, 
questions should be clearly stated, simply constructed, worded to 
encourage valid free responses, selective as to importance and relevance
to the purposes, and written to facilitate tabulation and presentation
of findings.
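
The extra sorting step which the write-in form of the age question imposes on the tabulator can be sketched in Python. The three age groups and the written-in ages are those given in the text; the names `AGE_GROUPS`, `age_group`, and `tabulate_ages` are hypothetical.

```python
# The three age groups suggested in the text.
AGE_GROUPS = [(8, 11), (12, 15), (16, 19)]


def age_group(age):
    """Return the label of the group an age falls in, or None if outside all groups."""
    for low, high in AGE_GROUPS:
        if low <= age <= high:
            return f"{low}-{high}"
    return None


def tabulate_ages(ages):
    """Count written-in ages by group, skipping any age outside the groups."""
    counts = {f"{low}-{high}": 0 for low, high in AGE_GROUPS}
    for age in ages:
        label = age_group(age)
        if label is not None:
            counts[label] += 1
    return counts


# The written-in answers used as the example in the text.
written_in = [8, 10, 12, 13, 14, 18]
print(tabulate_ages(written_in))
```

When the respondent checks the age group himself, the `age_group` step disappears: the tabulator merely counts check marks, which is why the second form of the question is less subject to error in classification.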

In the process of constructing questions for the questionnaire, it is 
advisable to write each question on one side of a separate index card 
or sheet of paper. If one person is constructing the items, he can write 
whatever questions he can think of, without attempting to refine or 
polish phrases for grammar, and just write until he is out of relevant 
ideas. Then he can go over each of the items for editing and remove 
duplicates, etc. When two or more persons are cooperating in the 
construction of a questionnaire (something which should be encour- 
aged because others can contribute more ideas and good criticism), 
each can jot down items separately and share them. Following this, 
each person might take those items which deal with an area in which 
he has special interest or knowledge and work those questions into 
polished form.

After the questions have been approved, each question should be 
numbered. This simple procedure is important for two reasons. First, 
when any individual (peer or respondent) wants to comment on the 
item, the number beside the question permits everyone to turn to the 
right item immediately. Second, there is less likelihood of confusing 
items in tabulation if each question is separately numbered. If there
is a question 3 on page 1 and a question 3 on page 2, it is possible to
confuse tabulated data—or in any event, it takes more trouble to avoid 
such confusion. Numbering all questions separately, therefore, is rec- 
ommended for efficient and accurate criticism and tabulation. 




Arrange Questions in Appropriate Groupings There are three main 
reasons for grouping questions dealing with the same points or areas. 
It seems that when a respondent directs his attention to any question, 
he is mindful of factors relating to that question. Consequently, so 
long as one has a “mind-set” toward a particular group of facts, it 
would be in the interest of efficiency (time expended recalling an- 
swers to questions) to group similar items. The principle of simi- 
larity extends not only to the construction of the item as to type 
(multiple-choice, true-false, recall, etc.) but also to areas being sam- 
pled. A second reason for grouping items is to make the tabulation 
more systematic and interpretation of the question simpler. Finally, 
keeping similar items (or content) together permits one to see more 
readily whether any important points or questions are being omitted 
or whether unimportant questions have been retained. 


Design the Format with Appeal The design of the questionnaire is 
an important consideration, one which teachers tend to neglect be- 
cause they are so concerned with the questions and their answers. 
Actually, poor design creates attitudes in the respondent which mili- 
tate against the collection of valid and reliable data. A crowded- 
looking sheet of paper which lists question after question in monoto- 
nous style does not encourage the reader to respond willingly and 
completely. On the other hand, a well-printed, well-spaced, and attractive-looking
questionnaire encourages the respondent to answer
questions fully and with interest.

What comprises a well-designed questionnaire? Without too much
detail, the following recommendations are pertinent:


a. Center the title of the questionnaire. Make the title look
   good and sound good. The title should be clear, concise,
   and descriptive of the project.

b. If there is an interested group or class or committee
   administering the questionnaire, center its name under the
   title.

c. If there is special importance attached to the month (or
   season) or year of the study, include the date, etc., in the
   title.

d. If there are no written instructions provided on a special
   sheet and if the questionnaire is being administered orally,
   there may be need for a few lines devoted to explaining
   the purpose and use of the questionnaire and its results.

e. Usually, basic data of a descriptive nature are collected
   first. One asks for name, age, class, standing, sex, address,
   etc., at the beginning of the questionnaire. These questions
   should all be grouped together. Plenty of space should be
   provided to allow for clearly-written answers.

f. Adequate room for answers to questions should be provided.
   An important question which should stimulate the
   respondent to answer fully may need eight spaces, while
   a relatively unimportant question may need only two
   spaces. Sometimes the nature of the question rather than
   its importance may dictate the size of the open space. A
   common error is to provide too little or too much space
   for an answer to a question.

g. To save time and to appeal to the eye, questions of a
   similar nature may be grouped together. (See items 9-14,
   and 15-17, of the illustrative questionnaire on page 142.)


Check the Questionnaire for Adequacy Since most human beings, 
including teachers and research investigators, are liable to make seri- 
ous or simple mistakes, it is always advisable, before mimeographing 
or printing, to check the questionnaire for adequacy. Spelling, gram- 
matical, and typographical errors should be eliminated. However, 
before cutting final stencils, or setting type, it is advisable to seek 
criticism from interested colleagues and from a small group who are 
similar in most respects to the group which will respond to the final 
draft of the questionnaire. Encourage these persons to raise ques- 
tions when they do not understand the vocabulary or the meaning 
of various questions. After passing this hurdle, it is generally advisable
to engage in sample tabulations, so that any inadequacies
will show up before the final draft of the questionnaire is administered 
to the main group. If a question is not eliciting the kind of informa- 
tion it was designed to collect, improvements can be made before 
final copy is approved. 


ADMINISTERING THE QUESTIONNAIRE 


Although it would appear that administering a questionnaire is as 
simple as “handing it out and collecting it,” there are a number of
conditions which are necessary for good administration. These conditions
are (1) insuring a good climate for proper administration,
(2) stating clear purposes, and (3) providing clear directions and a
good working situation. 

Insufficient attention is generally paid to various intangibles, such
as the attitude of the administrator and the atmosphere of the classroom.
The teacher who indicates to the group that the answers to a
questionnaire are important and possibly confidential (if they are to 
be so held) can expect students to respond to questions with a seri- 
ousness demanded by the device. If the teacher exclaims “Here is 
something we must collect, so let’s get it over with as fast as possible,” 
there is little likelihood that the results will be worth much. If the 
students trust the teacher, one can expect honest and complete re- 
sponses; if not, there is little chance that students above the sixth 
grade will provide information of a self-incriminating or self-revealing
nature.

This leads to the second point, the atmosphere of the classroom. A 
“permissive” classroom is one in which the students feel that the 
teacher is functioning primarily as an interested and helpful guide. 
The students are free to ask the teacher questions which may seem 
irrelevant or which reveal personal inadequacies or ignorance; the 
teacher answers questions patiently and pleasantly; the teacher and 
students share experiences and ideas and always look to each other's 
well-being without undue judgment or vindictiveness when things 
or people go "wrong." It is probable that students who answer ques- 
tionnaires in such a classroom will be more likely to provide valid 
and reliable responses than those in a classroom characterized by 
authoritarian controls and practices. The attitudes of both the instructor
and student and the atmosphere of the classroom form the
background into which the questionnaire is brought for administra- 
tion. Attention to this background should improve the results ob- 
tained through questionnaire administration. 

Should questionnaires be signed or anonymous? It is frequently 
stated that a person is more likely to be honest if he cannot be iden- 
tified, but the research evidence on this point is not conclusive. The 
teacher, then, must deal with this problem according to the demands 
and nature of the situation. If honesty in response makes the student 
subject to recrimination, it seems unlikely that students in any large 
numbers would risk being honest, with or without signing their name 
to a questionnaire. While no authoritative conclusion concerning un- 
signed vs. signed questionnaires is available, the teacher is counseled 
to have signed questionnaires. If students do not trust the teacher or 
do not want a teacher to know something about them, it is unlikely 
that unsigned questionnaires will elicit such information. If students 
trust the teacher, they will be happy to tell him what he needs to 
know. It should be noted, too, that follow-up work with individual 
students is impossible when questionnaires are not signed. 

Stating the purpose of the questionnaire also leads to better results.
Administrators tend to overlook the importance of stimulating students
to respond clearly, honestly, and fully. When students are told
the reasons why a questionnaire is being given, assuming these rea- 
sons are important and therefore convincing, the students will be 
highly motivated. 

Clear directions and good working conditions must be provided for 
those taking the questionnaire. Whether the directions are oral, writ- 
ten, or a combination of both, the respondent must know how each 
question is to be answered. 

Typical things to check in the administration of a questionnaire 
do not differ from ordinary requirements of a good physical set-up 
in a classroom: adequate lighting, proper temperature and ventila- 
tion, suitable chairs and desks, freedom from disturbances and noise, 
a sufficient supply of pencils and erasers, space for the proctor or 
teacher to walk around each student (to insure that directions are 
being followed or to answer any questions raised), and provision 
for students to leave without disturbing those who are not finished. 
When a student says he is finished, it is wise to check over the answers 
in the presence of the student so that if there are omissions, the stu- 
dent can answer the omitted questions immediately. Otherwise, one 
typically finds a situation where students have neglected to answer 
questions, and the tabulation, as a result, becomes incomplete or 
uneven. 
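
Where responses have been transcribed for tabulation, the same check for omissions can be sketched in Python. The function `omitted_items` and the sample answers are hypothetical; the sketch assumes each answer is stored under its question number and that a blank or missing entry means the question was skipped.

```python
def omitted_items(answers, question_numbers):
    """Return the numbers of the questions a respondent left unanswered.

    `answers` maps a question number to the recorded response; a missing
    entry or a response that is empty or only spaces counts as an omission.
    """
    return [
        number
        for number in question_numbers
        if not str(answers.get(number, "")).strip()
    ]


# Hypothetical responses from one student to a five-item questionnaire.
answers = {1: "Y", 2: "", 3: "N", 5: "  "}
print(omitted_items(answers, range(1, 6)))
```

A list of omitted item numbers lets the teacher point the student to the specific questions still to be answered before the sheet is collected.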


The Interview 


THE NATURE OF THE INTERVIEW 


The interview is an extremely valuable means of conducting evalua- 
tion in the school. Not only can the interview be used as an informa- 
tion-gathering device in the same manner as the questionnaire, but 
the teacher can employ the interview to diagnose student behavior 
and to appraise the achievement of an individual. 

Probably the greatest advantage of the interview over the ques- 
tionnaire is its flexibility. It allows the interviewee to ask for clarifi- 
cation of a question and the interviewer to raise all kinds of follow-up 
questions to the answers of the respondent. Sometimes, the responses 
which are given may stimulate the interviewer to raise questions 
which he had not thought of at the start of the interview. 

Another important characteristic and advantage of the interview 
is the opportunity it gives the interviewer to hear how an interviewee 
has said something as well as what he has said. The answers to a
questionnaire may not include such important background information
about a person as accent, fluency, tone of voice, meaning, and so
on. The student who has said “I sure would hate it” might be saying
“I’d love to do it” if the interviewer observed his wink, his smile, and
the way he said it. These important aspects of the individual often 
can be obtained only by the interview method. 

While it is true that the main function of evaluation is to appraise 
educational objectives, it should not be forgotten that the process
itself, including the means employed, may be educative. Teachers and 
students are usually unaware of the benefits of interviews for teach- 
ing or learning purposes. An interview on how Johnny solves an arith- 
metic problem may reveal to the teacher what techniques of thinking 
Johnny employs, but at the same time Johnny may be learning more 
arithmetic. In this connection teachers need to be conscious of the 
fact that interviews are not necessarily “formal,” with the trappings 
of appointments, special interviewing rooms, etc. The informal inter- 
view with a student in the classroom is a neglected means for diagnos- 
ing student behavior. Quietly chatting with a student at his seat, the 
teacher is in a position to gather valuable information about the pupil. 

Many times the interviewer is so concerned with the answer to his
questions that he fails to take advantage of the interview for its value
along other lines. One college teacher required each member of his
class to be interviewed at least three times during the term. During
one of the interviews, the instructor noticed the student tearing off
pieces of her fingernails and fighting back tears. While there was no
apparent relationship between the answers to the questions on classroom
progress and the emotional symptoms, an unplanned question
about the evidences of tension brought forth an important story of
familial difficulties. As in this case, one advantage of the interview
is that confidential and personal materials can be handled so as to
help students grow emotionally as well as intellectually.

This inter-relationship between the interviewer and the interviewee,
while a great advantage, may also be a disadvantage. There
are always chances that time will be wasted, unnecessary questions
raised, irrelevant materials covered, or that a constructive person-to-person
relationship cannot be sustained because of the personality
of either the interviewer or interviewee.


TYPES OF INTERVIEWS


In recent years, the respective followers of Darley and Rogers have
carried on a running controversy concerning the type of interview
to be employed, particularly in the field of counseling. Darley (3)
has stressed that the interviewer, being well informed, is in a better
position to guide the interviewee than the relatively ignorant and
confused interviewee himself. Consequently, Darley is considered
the advocate of a directed interview. The directed interview is one
in which the interviewer asks “directing” questions and makes suggestions
to the interviewee.

Rogers (4) and his followers place stress on the non-directive tech- 
nique of interviewing. Although the interviewer asks certain ques- 
tions and, to that extent, guides the interview, stress is placed on 
helping the interviewee gain self-insight. The assumption is that if 
the interviewee faces his problems on his own, and develops intelli- 
gent answers or approaches during the interview, he gains independ- 
ence and self-insight. In this sense, the interviewer does not “tell” 
the interviewee or counselee what to do. 

For the elementary student in the field there may be no need to 
take sides. Which type of interview is more useful depends on the 
contents of the interview and on the maturity and understanding of 
the interviewee. As Wrenn (7) points out, most interviews are, in all 
probability, a combination of directive and non-directive techniques. 
At a time when people are becoming conscious of the meaning and 
implications of a democratic philosophy, it is not surprising to find 
educators sensitive to anything which seems to involve authoritarian 
techniques of imposing solutions on people without respecting their 
personalities and permitting them to arrive at their own decisions. 


TECHNIQUES OF INTERVIEWING 


There are at least three main aspects to the technique of inter- 
viewing. First, various parts of the interview involve different pur- 
poses. Second, there are certain mechanics related to the raising of 
questions. Third, there is the problem of recording results. Without 
going into too much detail on these three matters, what principles 
should be observed in interviewing? 


Parts of an Interview Although interviews may be broken down in 
various ways, there is always a beginning, a middle, and an end. 
The main goals at the beginning entail establishing rapport and clari- 
fying the purposes of the interview. For the former, depending on 
the time, place, and person, the interviewer may just smile pleasantly 
and greet the person by name or may offer some refreshment. The 
purposes of the interview, of course, are set by the person who initiates
the interview. Sometimes the teacher and student decide together
what is going to be taken up. 

The second part of the interview involves carrying out the purposes
of the interview. If the purpose is to gather background material
on a person or to explore the various alternatives to solving a
problem, this stage of the interview constitutes a question-and-answer
session. If the purpose is to help a student gain personal insight,
find out the kinds of books he should read, or the experiences
he needs, then the gaining of such insights on the part of the inter- 
viewee is the goal of the middle of the interview. 

The end of the interview involves terminating the face-to-face 
relationship. Toward the close of the interview, either the interviewer 
or interviewee may attempt to summarize what has been learned or 
what must be done, etc. The actual leave-taking may involve setting 
up an appointment for the next interview or any other condition 
appropriate to follow-up. Of course, the usual social amenities are 
followed to insure continued rapport between pupil and teacher. 


Securing Responses The interview may be conducted in a variety
of ways. When there is an attempt to have a structured or controlled
interview covering definite ground, schedules of questions are generally
drawn up beforehand, and made available to the interviewer
alone or to both the interviewer and interviewee. Under circumstances
where the interview is less controlled or less structured, one may give
the interviewee a list of topics only and ask questions about them. In
an unstructured interview, conditions are least controlled. The interviewer
will raise various questions orally with the interviewee and
take advantage of the flexibility and informality of the situation.


Recording Results When an interview lasts for approximately one
half hour, many matters will have been discussed, and it is obvious
that some type of record will be necessary. Attempts have been made
to record interviews by using outside observers and by having the
interviewer take notes. The main argument against taking notes during
the interview is the adverse effect it may have on the interviewee.
Writing notes tends to break down rapport and the continuity of the
interview. It is, therefore, customary to find the interviewer summarizing
the interview as soon as it is over by jotting down the most
important points and re-creating the questions and answers. Covner
(1), however, compared interviewers’ reports written after the interview
with phonograph recordings of the interview situation. He discovered
that only 10 to 35 per cent of the material included in the
recorded interviews was written up by interviewers who did not use
some means of note-taking.

To be sure, the purpose and further use to which the interview 
materials will be put determine the nature of the records to be kept. 
It must be remembered that the ordinary teacher does not have the 
time to make extensive records or to utilize the results of such records 
when his classes are large. At the present time, it is likely that the 
teacher will take brief notes during the course of the interview and 
summarize the main points after the interviewee leaves. 

Recording equipment is increasingly being used to put interviews 
in more permanent form for subsequent analysis or use (2). The use 
of record discs is not as economical as tape recordings, for the former 
are permanent while the latter can be erased and re-used. However, 
recorded interviews may be invalid because the interviewee may be 
under tension in the presence of a microphone. While using concealed 
microphones is ethically questionable, it is common practice to explain 
to the interviewee why a recording is being made and to secure his 
permission to record. 


Summary 


Questionnaires, inventories, and interviews are techniques for gath- 
ering data by obtaining responses to questions. Questionnaires and 
inventories are similar in that they constitute paper-and-pencil ap- 
proaches, embodying similar questions and purposes. The interview 
is a face-to-face personal relationship in which greater flexibility in 
questioning is possible. The interview also lends itself to dealing with 
confidential and personal material which cannot be obtained through 
the questionnaire. 

In constructing questionnaires and inventories, several important 
points should be observed. These are (1) use the questionnaire when 
it is most appropriate, (2) define the purposes and aims, (3) con- 
struct appropriate questions, (4) arrange questions into appropriate 
groupings, (5) design the format with appeal, and (6) check the 
questionnaire for adequacy. In the administration of a questionnaire 
or inventory, one should insure proper conditions of administration, 
state clear purposes to the students, and provide them with clear 
directions. 

Although the interview may be conducted in several ways, depending
upon the degree of flexibility which is desired, every interview
has a beginning, a middle, and an end. The beginning of the interview
is given over to establishing rapport and clarifying the purposes
of the interview; the middle to data-gathering or exploration of
alternative solutions to a problem; the end to termination of the
interview. In most instances, the interviewer will find that a written
summary of the main points, based upon notes taken during the course
of the interview, will serve as an adequate record.


Problems for Class Discussion 


1. Below you will find a questionnaire used in a survey of certain interests
of sixth-grade pupils. What are the major weaknesses of this questionnaire?
How would you reframe the questionnaire in order to eliminate these
weaknesses?


A NEWSPAPER, RADIO, TELEVISION, MOVIE AND MAGAZINE
QUESTIONNAIRE

Roosevelt School—Sixth Grade Survey

Television programs, movies, and magazines are enjoyed by students in our
sixth grade. Your marks will not be affected by your answers, so please be
frank. Follow directions given for each item below by circling Y (Yes) or
N (No) or by writing in answers to questions as fully as possible. The
results will be shared with all 210 students in the seven different sixth-grade
groups in order to reveal our favorite reading matter or recreational program.

Name ____________________          Date ____________
Address ____________________
Sex ______   Age ______            Homeroom Teacher ____________

Y N 1. a. Do you enjoy reading a newspaper regularly?
       b. Which newspaper do you enjoy reading regularly?
          (Check one or more)
          ( ) Chicago Tribune
          ( ) Chicago Daily News
          ( ) Chicago Herald-American
          ______________ (Add others)

Y N 2. a. Do you enjoy listening to the radio?
       b. Name your three favorite radio programs.
          1. ______________
          2. ______________
          3. ______________

Y N 3. a. Do you enjoy watching television programs?
       b. Name your three favorite television programs.
          1. ______________
          2. ______________
          3. ______________

Y N 4. a. Do you enjoy going to the movies?
       b. Name the three movies you enjoyed most in the past year.
          1. ______________
          2. ______________
          3. ______________

Y N 5. a. Do you enjoy reading magazines?
       b. What three magazines are your favorite?
          1. ______________
          2. ______________
          3. ______________

    6. If you could have only one of these five activities (newspaper,
       radio, television, movie, and magazine) which one would be hardest
       for you to give up?

    7. Write a few reasons why you can least do without the activity
       mentioned in your answer to question 6.


2. Construct a questionnaire designed to elicit information concerning the
extent to which a junior-high-school group is aware of and utilizes
available resources in your local community.

3. Assume that your local school system wishes to undertake a study of the
careers of its former students for a five-year period after their graduation
from high school. Prepare a schedule of questions which might be used
as a basis for a controlled interview with each graduate.


References Cited in This Chapter


1. Covner, Bernard J., “A Comparison of Counselors’ Written Reports with
Phonographic Recording of Counseling Interviews.” Unpublished Ph.D.
thesis, Ohio State University, 1942.

2. Covner, Bernard J., “Studies in the Phonographic Recording of Verbal
Material: 1. The Use of Phonographic Recordings in Counseling Practice
and Research,” Journal of Consulting Psychology, 6:105-118, March-
April, 1942.

3. Darley, John G., The Interview in Counseling. Washington: Retraining
and Reemployment Administration, U. S. Department of Labor, 1946.

4. Rogers, Carl R., Counseling and Psychotherapy. New York: Houghton
Mifflin and Co., 1943.

5. Toops, Herbert A., “Predicting the Returns from Questionnaires: a Study
in the Utilization of Qualitative Data,” Journal of Experimental Education,
3:204-215, 1935.

6. Traxler, Arthur E., Techniques of Guidance. New York: Harper &
Brothers, 1945.

7. Wrenn, C. Gilbert, “Client-Centered Counseling,” Educational and
Psychological Measurement, 6:439-444, 1946.




References for Further Reading 


Koos, Leonard V., The Questionnaire in Education. New York: Macmillan 
Co., 1928. 
The first comprehensive study of the use of the questionnaire in educa- 
tion. Many of the points made about early questionnaires are still valid. 
Bingham, Walter V., and Moore, Bruce V., How to Interview. New York: 
Harper & Brothers, 1941. 
An extensive and authoritative work in the field. 
Torgerson, Theodore L., Studying Children. New York: Dryden Press, 1947. 
Contains many illustrative checklists and questionnaires which help 
clarify the meaning of pupil behavior. 


CHAPTER NINE   Checklists and Rating Scales


Checklists and rating scales are similar types of evalu- 
ative instruments. The checklist, as the name literally indicates, is a 
selected list of words, phrases, sentences, or paragraphs following 
which an observer records a check (✓) to denote the presence or
absence of whatever is being observed. The checklist may include 
items which represent expected desirable or undesirable forms of 
behavior, a sequence of skills associated with a given operation, or a 
group of ideas. The rating scale is a selected list of words, phrases, 
sentences, or paragraphs following which an observer records a value 
or rating based upon some objective scale of values. The funda- 
mental difference between the checklist and the rating scale, there- 
fore, lies in the use of the latter as a means for quantifying judgments 
about observations. 
This chapter, after a discussion of the school uses of these instru- 
ments, considers various types of checklists and rating scales and ways 
of constructing them. 


Uses of Checklists and Rating Scales 


Checklists and rating scales are commonly used in schools for a
variety of reasons. Among the more important values associated with
the use of these instruments are (1) promotion of good teaching,
(2) assistance in curriculum planning, and (3) improvement in
administration.


Promotion of Good Teaching. There are numerous ways in which
the intelligent use of checklists and rating scales can promote good
teaching. Both teachers and pupils are interested in the degree of
achievement attained, not only in the final product (an essay, a
painting, a wooden tie-rack, etc.), but also in the degree to which methods
of working (process values) have been mastered. Checklists and rating
scales can be used in evaluating the work process and products
of individuals or groups of children in school. These techniques help
diagnose strengths and weaknesses in the ways in which a student
goes about his work and in the quality of the completed work. (An
excellent illustration of these uses may be found in the checklist on
microscope techniques shown on pages 160-162.) Checklists of common
faults, used as a basis for prevention or correction, may help
students overcome barriers to their continued school progress.

Another important value in using such tools is that pupils come to
learn what is expected of them. Cooperative teacher-pupil planning
should result in the development of goals to be attained by individual
pupils and by groups. A checklist of such goals can serve both as a
guide to learning and as a means of determining what has been
accomplished at the end of the school year. The development of a checklist
of purposes thus can foster educational values associated with
cooperative work and self- and group discipline based upon common
understanding and agreement.

Pupil formulation of checklists and rating scales may serve as a
means of encouraging the development of the ability to discriminate
and of fostering good habits. By rating books, movies, comics, radio
and television programs, trips, and other interests, pupils practice the
art of selection. Individual charts for recording progress in mastering
skills in health, speech, social manners, etc., kept up to date by
the pupil under teacher guidance, help the student learn what
constitutes acceptable behavior. Of course, the great pitfall for the teacher
to avoid is the rote learning of the items on a checklist in formal
drill style.


Assistance in Curriculum Planning. Checklists and rating scales may
be used profitably in curriculum planning and evaluation. It may be
desirable to have classes and teachers who follow common curricula
indicate what they enjoyed or learned most about in the way of content,
and what methods of teaching were most successful. Those responsible
for curriculum planning might then use such faculty and
student opinions and judgments as the basis for making improvements.

Checklists or rating scales (in the form of a score card) are sometimes
used to rate such things as books, pamphlets, and magazines,
when these are to be selected for the new school year. Since limited
school budgets necessitate careful selection of school materials, the
use of such devices is of great value.


Improvement in Administration. Administrators can use checklists
and rating scales in two ways. First, some of the material which ulti- 
mately is recorded on the cumulative record card for the individual 
student may have as its origin a teacher or pupil checklist used in the 
classroom. Second, checklists can be used to rate various aspects of a 
school. Illustrating this approach are the Evaluative Criteria em- 
ployed in the Cooperative Study of Secondary Schools (1). 


Types of Checklists 


Checklists and rating scales take many different forms. A few illus- 
trations at various school levels of the innumerable varieties of such 
instruments are given below as suggestive models for the teacher. 

The following is a portion of a checklist which was used for teach- 
ing and research purposes in a study carried on cooperatively by the 
Rochester, N. Y., school system and the staff of the Evaluation of 
School Broadcasts, Ohio State University. 


RADIO CHECKLIST

Name ____________________________          Boy ____  Girl ____
     (print)  LAST        FIRST            (check one)
Age ______  Birthday (MONTH, DAY) ____________  Grade ______  Teacher ____________
School ____________________  City ____________  State ____________

Directions: Most boys and girls like to listen to radio programs at home. Do
you listen to any of the programs in the following list? Mark those programs
with a

1—if you like the program so much you don't want to miss it.
2—if you like to listen to the program sometimes.
3—if you listen to the program when you have nothing else to do.

Write in the names of other programs that you think should be included in
this list, and mark them also with a 1, 2, or 3 to show how much you like
them.

MIXED PROGRAMS
___ Jack Benny
___ Charlie McCarthy
___ Johnny Presents
___ Bob Hope
___ Al Pearce
___ Fred Allen
___ Major Bowes
___ Avalon Time
___ Burns and Allen

DRAMATIC PROGRAMS
___ Silver Theatre
___ Screen Guild Theatre
___ Campbell Playhouse
___ Lux Radio Theatre
___ Texaco Star Theatre
___ Hollywood Playhouse
___ Columbia Workshop
___ First Nighter
___ Grand Central Station
___ Jack Armstrong
___ Little Orphan Annie
___ Scattergood Baines
___ Amos and Andy
___ Lum and Abner
___ Blondie
___ Lone Ranger
___ Vic and Sade
___ Tom Mix

CRIME AND MYSTERIES
___ The Shadow
___ Ellery Queen
___ Sherlock Holmes
___ I Love a Mystery
___ Mr. Keen
___ Big Town
___ Gang Busters

QUIZ PROGRAMS
___ True or False
___ Doctor I.Q.
___ Information Please
___ Battle of Sexes
___ Kay Kyser
___ Professor Quiz
___ Don't Forget

COMMENTATORS
___ Walter Winchell
___ Paul Sullivan
___ Edwin C. Hill
___ Elmer Davis
___ H. V. Kaltenborn
___ Raymond Gram Swing
___ Fulton Lewis

PEOPLE, PLACES, EVENTS
___ Pilgrimage of Poetry
___ The World Is Yours
___ Hobby Lobby
___ Vox Pop
___ Americans at Work
___ What Price America
___ Democracy in Action

CLASSICAL MUSIC
___ Ford Sunday Eve. Hour
___ Voice of Firestone
___ Rochester Civic Orch.
___ NBC Sym. Orch.
___ New York Philharmonic
___ Metropolitan Opera
___ Cities Service Concert

DANCE MUSIC
___ Henry Busse
___ Guy Lombardo
___ Benny Goodman
___ Eddy Duchin
___ Horace Heidt
___ Hal Kemp
___ Count Basie



The sequence type of checklist is illustrated by the following, which 
was used for diagnostic purposes in college botany classes (9). The 
entire checklist is given in order to illustrate the degree of complete- 
ness which may be achieved in evaluating both process and product. 


CHECKLIST OF STUDENT REACTIONS IN FINDING
AN OBJECT UNDER THE MICROSCOPE

Student's Name ______________      Time Begun ________
Section ______________             Time Finished ________
Date ______________                Time Consumed ________

DIRECTIONS

On the microscope table are a microscope, yeast culture, or other suitable
material, slides, covers, cloth, and lens paper. Direct the student to find a
cell or other object under the microscope and show it to you. Time him in
seconds from the time he receives the directions. Trace his actions by
placing a figure 1 after his first action, a figure 2 after his second action,
and so on in the order of his performance. Characterize his behavior and
his mount by checking appropriate terms from the lists given below.

Add any additional comments in the blank on this page. In summarizing
the student’s actions the instructor may wish to suggest skills in which the
student should receive additional training by checking the appropriate items
in the list of skills in which student needs further training.

STUDENT'S ACTIONS (enter the SEQUENCE OF ACTIONS in the blank)

a. Takes slide ___
b. Wipes slide with lens paper ___
c. Wipes slide with cloth ___
d. Wipes slide with finger ___
e. Moves bottle of culture along the table ___
f. Places drop or two of culture on slide ___
g. Adds more culture ___
h. Adds few drops of water ___
i. Hunts for cover glasses ___
j. Wipes cover glass with lens paper ___
k. Wipes cover with cloth ___
l. Wipes cover with finger ___
m. Adjusts cover with finger ___
n. Wipes off surplus fluid ___
o. Places slide on stage ___
p. Looks through eyepiece with right eye ___
q. Looks through eyepiece with left eye ___
r. Turns to objective of lowest power ___
s. Turns to low-power objective ___
t. Turns to high-power objective ___
u. Holds one eye closed ___
v. Looks for light ___
w. Adjusts concave mirror ___
x. Adjusts plane mirror ___
y. Adjusts diaphragm ___
z. Does not touch diaphragm ___
aa. With eye at eyepiece turns down coarse adjustment ___
ab. Breaks cover glass ___
ac. Breaks slide ___
ad. With eye away from eyepiece turns down coarse adjustment ___
ae. Turns up coarse adjustment a great distance ___
af. With eye at eyepiece turns down fine adjustment a great distance ___
ag. With eye away from eyepiece turns down fine adjustment a great distance ___
ah. Turns up fine adjustment screw a great distance ___
ai. Turns fine adjustment screw a few turns ___
aj. Removes slide from stage ___
ak. Wipes objective with lens paper ___
al. Wipes objective with cloth ___
am. Wipes objective with finger ___
an. Wipes eyepiece with lens paper ___
ao. Wipes eyepiece with cloth ___
ap. Wipes eyepiece with finger ___
aq. Makes another mount ___
ar. Takes another microscope ___
as. Finds object ___
at. Pauses for an interval ___
au. Asks, “What do you want me to do?” ___
av. Asks whether to use high power ___
aw. Says, “I’m satisfied” ___
ax. Says that the mount is all right for his eye ___
ay. Says he cannot do it ___
az. Told to start a new mount ___
aaa. Directed to find object under low power ___
aab. Directed to find object under high power ___

NOTICEABLE CHARACTERISTICS OF STUDENT'S BEHAVIOR

a. Awkward in movements ___
b. Obviously dexterous in movements ___
c. Slow and deliberate ___
d. Very rapid ___
e. Fingers tremble ___
f. Obviously perturbed ___
g. Does not take work seriously ___
h. Obviously angry ___
i. Unable to work without specific directions ___
j. Obviously satisfied with his unsuccessful efforts ___

CHARACTERIZATIONS OF THE STUDENT'S MOUNT

a. Poor light ___
b. Poor focus ___
c. Excellent mount ___
d. Good mount ___
e. Fair mount ___
f. Poor mount ___
g. Very poor mount ___
h. Nothing in view but a thread in his eyepiece ___
i. Something on objective ___
j. Smeared lens ___
k. Unable to find object ___

SKILLS IN WHICH STUDENT NEEDS FURTHER TRAINING

a. In cleaning objective ___
b. In cleaning eyepiece ___
c. In focusing low power ___
d. In focusing high power ___
e. In adjusting mirror ___
f. In using diaphragm ___
g. In keeping both eyes open ___
h. In protecting slide and objective from breaking by careless focusing ___

ADDITIONAL COMMENTS


In a course in Home Mechanics for junior-high-school boys and
girls, this checklist was used to evaluate the final products of student
work.


BULB SOCKETS AND PLUGS
ANALYSIS CHART

Name ______________          Date ______________

1. Insulation:
   ___ Ragged ends              ___ Clean cut
   ___ Particles left           ___ All removed

2. Wire:
   ___ Strands cut              ___ Intact
   ___ Corroded                 ___ Clean
   ___ Too long                 ___ Correct length
   ___ Too short
   ___ Spread                   ___ Twisted

3. Terminals:
   ___ Direct to terminal       ___ Wrapped around prong to terminal
   ___ Wire counterclockwise    ___ Clockwise
   ___ Loose                    ___ Tight

4. Operation:
   ___ No results               ___ Bulb lights
   ___ Fuse blown

COMMENT: ______________      CHECKED BY ______________


Simple illustrations of abbreviated checklists containing a group of
“expected desirable forms of behavior” in the area of health habits
are reproduced below.

Check whether the pupil has good health habits for each of the
following:

Word Type    Phrase Type      Sentence Type
Teeth        Brushed Teeth    He brushes his teeth each morning.
Hair         Combed Hair      He combs his hair each morning.
Nose         Cleaned Nose     He has a clean nose each morning.
Eyes         Cleaned Eyes     He has clean eyes each morning.
Ears         Washed Ears      He washes his ears each morning.
Face         Washed Face      He washes his face each morning.


An illustration of a portion of a checklist containing a group of ideas,
to be used in checking the answer to an essay question dealing with
the probable results of unemployment, may take the following form:

Directions: Check whether the pupil has seen relationships between
unemployment and the following:

Effect on human beings

___ Materially          ___ Psychologically
___ Socially            ___ Emotionally
___ Politically

Effect on social institutions

___ Industries          ___ Church
___ Unions              ___ Political Parties
___ Schools             ___ Press, radio, movies, TV
___ Government


The following checklist, which is to be completed by the student, 
includes items which involve a series of steps in finding out about a 
new people, in this instance, the Hindus: 


Directions: Check only the steps you have taken:

___ Consulted an encyclopedia
___ Consulted the dictionary
___ Consulted Cumulative Book Index
___ Consulted New York Times Index
___ Referred to card file on Hindus in school library
___ Consulted Readers' Guide to Periodical Literature
___ Wrote to Indian Embassy in Washington, D. C.


Types of Rating Scales 


1. THE DESCRIPTIVE RATING SCALE 

The checklist may be converted into a descriptive rating scale with
little difficulty by the addition of a scale for noting the degree to
which a given aspect of behavior is in evidence. The following rating
scale was drawn up by a scout leader to evaluate his group. The
scale can be further quantified by assigning the values 3, 2, 1, and 0
to “all the time,” “most of the time,” “occasionally,” and “never,”
respectively.

DESCRIPTIVE RATING SCALE

Name ______________          Date ______________

                                              ALL THE       MOST OF
                                              TIME          THE TIME      OCCASIONALLY  NEVER
 1. Waits his turn to talk.                   ____          ____          ____          ____
 2. Talks to the point.                       ____          ____          ____          ____
 3. Tries to see the other fellow’s point.    ____          ____          ____          ____
 4. Listens to suggestions of others.         ____          ____          ____          ____
 5. Offers suggestions to group.*             ____          ____          ____          ____
 6. Abides by majority decisions.             ____          ____          ____          ____
 7. Doesn’t get “sore” if his suggestions     ____          ____          ____          ____
    aren’t carried out.
 8. Takes his share of responsibility.        ____          ____          ____          ____
 9. Carries to completion the tasks he        ____          ____          ____          ____
    accepts.
10. Doesn’t “play around,” but gets down      ____          ____          ____          ____
    to business.
11. Can work without specific instructions.   ____          ____          ____          ____
12. Looks out for interest of group.          ____          ____          ____          ____
13. Works well with most of the boys in       ____          ____          ____          ____
    the group.
14. Is liked by most of the group.*           ____          ____          ____          ____
15. Doesn’t try to be “boss.”                 ____          ____          ____          ____
16. Doesn’t feel that others are              ____          ____          ____          ____
    “picking” on him.
17. Takes responsibility for some             ____          ____          ____          ____
    leadership.
18. Accepts leadership of boys his age.       ____          ____          ____          ____
19. Doesn’t take petty group difficulties     ____          ____          ____          ____
    to leader.
20. Considers leader’s suggestions.           ____          ____          ____          ____

* Many—Average No.—Few—None.
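
The quantification described for the descriptive rating scale can be sketched in a few lines of Python. This is only an illustration: the weights 3, 2, 1, and 0 come from the text, but the function name `score_scale` and the sample ratings are assumptions made for the example, and the starred items (which use "Many, Average No., Few, None") would need their own labels mapped to the same weights.

```python
# Weights from the text: "all the time" = 3 ... "never" = 0.
WEIGHTS = {
    "all the time": 3,
    "most of the time": 2,
    "occasionally": 1,
    "never": 0,
}


def score_scale(ratings):
    """Return (total, mean) for a list of category labels, one per item."""
    values = [WEIGHTS[label.lower()] for label in ratings]
    total = sum(values)
    return total, total / len(values)


# Hypothetical ratings for four items of one scout.
ratings = ["All the time", "Most of the time", "Occasionally", "Never"]
total, mean = score_scale(ratings)
print(total, mean)  # 6 1.5
```

A total or mean of this kind is only as meaningful as the judgments behind it; it summarizes the ratings and does not make them more objective.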




2. THE GRAPHIC RATING SCALE

The graphic rating scale represents a variation of this approach.
The rater’s evaluation is indicated by placing a check or cross on a
line to indicate presence or absence of a given trait. The instrument
below, constructed by Dr. H. C. Raye of Southern Illinois University,
and used in classes in stenography, is typical of the graphic
rating scale. The scale values which are used to quantify ratings,
instead of being restricted to 1, 2, 3, 4, 5, may also run from 100 to 0,
thus embodying the familiar per cent concept.


RATING SCALE OF FACTORS THAT DETERMINE
SUCCESS IN LEARNING SHORTHAND

I. Desire to Improve—Attitude
1               2               3               4               5
Evidences intense desire        Usually shows normal desire     Definite lack of interest
& interest at all times.        to gain skill.                  & motive.

II. Home Work
1               2               3               4               5
Usually exceeds assignment.     Normally well done & prompt.    Often incomplete. Messy. Late.
Neat. Always on time.

III. Class Attendance
1               2               3               4               5
Always present and prompt.      Few absences; justifiable.      Frequently absent—many unexcused.

IV. Materials and Supplies
1               2               3               4               5
Always has good pen and         Usually supplied with           Bad pen; dull pencil.
note paper.                     satisfactory tools & supplies.  Low grade paper.

V. Relaxation
1               2               3               4               5
Always comfortably relaxed.     Usually holds pen lightly;      Tense. “Pinches” pen.
No tenseness.                   does not bear down.             Muscles of hand, arm, and
                                                                body are tight.

VI. Vocabulary
1               2               3               4               5
High-frequency words            Normal ability to write         Even brief forms hazy.
automatic. Readily builds       common and new words.           No word-building ability.
others.

VII. Word-Carrying Capacity
1               2               3               4               5
“Gets it” even though           Usually writes okay when        Always loses out when
20-30 words behind.             gap is not too large.           dictator gets ahead.

VIII. Penmanship
1               2               3               4               5
Legible and beautiful.          Normally readable & good        Illegible—bad looking.
Uniform size & slant.           looking. Fluent motion.         Tends to draw outlines.

IX. Reading of Notes
1               2               3               4               5
Always reads accurately,        Usually able to read            Frequently unable to read.
rapidly, with meaning.          satisfactorily.


3. THE FORCED-CHOICE TECHNIQUE

One of the major disadvantages of both the descriptive and the
graphic rating scale is that the rater is able to control the final result
of his ratings. In recent years, the staff of the Personnel Research
Section of the Adjutant General's office (8) has developed a
forced-choice technique to offset this difficulty. In essence, the technique
forces the rater to choose between paired alternatives, both of which
raters in general are equally willing to use in describing a person.
The two alternatives, however, differ in the extent to which they
discriminate between two groups, for example, successful and
unsuccessful supervisors.

The examples given below are drawn from a scale now being used
experimentally in rating school supervisors in a large city system.
In each instance, the paired items are approximately equal in
“preference” value, in that they are equally acceptable to persons who are
used as raters. One of each pair, however, is the better as a
description of the good or poor supervisor.

A. An effective speaker
B. Has maintained or built a fine parents association

A. Has no sense of values: cannot distinguish the good from the bad
B. His instructions are indefinite or hard to follow

A. Conducts a continuous evaluation of teaching procedures
B. Continues to be a student

A. Seems neurotic
B. Uses the school for the advancement of pet theories and projects


4. THE RANK ORDER METHOD

One rating approach has attempted to eliminate the use of a
published scale altogether. Rather, persons being rated are placed in
serial or rank order in accordance with the rater's judgment of the
degree to which they possess the quality or attribute under
consideration. Ratings made by a number of judges may then be combined
into a composite or average rank. The statistical treatment of rank
order data, which has been described by Guilford (2) and Thurstone
(7), is relatively simple.
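
As a rough illustration of combining several judges' rankings into an average rank, the following Python sketch may help. It uses a plain arithmetic mean of ranks, which is an assumption made here for simplicity and is not necessarily the treatment described by Guilford or Thurstone; the function name `composite_ranks` and the pupil names are hypothetical.

```python
from statistics import mean


def composite_ranks(rankings):
    """Average each person's rank across judges.

    rankings: list of dicts, one per judge, mapping name -> rank (1 = highest).
    Returns (name, average rank) pairs sorted from best to worst.
    """
    names = list(rankings[0])
    averages = {name: mean(judge[name] for judge in rankings) for name in names}
    return sorted(averages.items(), key=lambda pair: pair[1])


# Hypothetical rankings of three pupils by three judges.
judges = [
    {"Ann": 1, "Ben": 2, "Cal": 3},
    {"Ann": 2, "Ben": 1, "Cal": 3},
    {"Ann": 1, "Ben": 3, "Cal": 2},
]
for name, average in composite_ranks(judges):
    print(name, round(average, 2))
```

Ties in the average are left in the order in which the names were first listed; a fuller treatment would need an explicit rule for them.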


5. THE PAIRED-COMPARISON METHOD 

The paired-comparison approach also obviates the use of a scale.
In this approach, each individual, in turn, is judged as better or
worse than every other individual in the group. The results of these
paired comparisons are then handled statistically to arrive at relative
ratings. The statistical analysis which is required (8) is extremely
time-consuming. When twenty individuals are compared, for example,
a total of 190 judgments, which involve computing 190 proportions
and looking up 190 scale values in a set of tables, must be made.
Although the method gives more reliable results than rank order or
rating scale methods, even the labor-saving statistical techniques
suggested by Guilford (2) do not seem to make it practical for
teacher use.
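
The figure of 190 judgments follows from the number of distinct pairs among n individuals, n(n - 1)/2, which for n = 20 is 20 × 19 / 2 = 190. The Python sketch below checks that count and tallies the share of comparisons each person wins for a single rater. It is a simplification offered under stated assumptions: it does not reproduce the scale-value analysis cited in the text, and the names `pair_count` and `win_share` and the sample data are hypothetical.

```python
from itertools import combinations


def pair_count(n):
    """Number of distinct pairs among n individuals: n(n - 1) / 2."""
    return n * (n - 1) // 2


def win_share(names, judgments):
    """Share of comparisons each person wins.

    judgments maps each (a, b) pair, in the order produced by
    itertools.combinations(names, 2), to the name judged better.
    """
    wins = {name: 0 for name in names}
    for pair in combinations(names, 2):
        wins[judgments[pair]] += 1
    return {name: wins[name] / (len(names) - 1) for name in names}


print(pair_count(20))  # 190

pupils = ["A", "B", "C"]  # hypothetical pupils
judgments = {("A", "B"): "A", ("A", "C"): "A", ("B", "C"): "C"}
print(win_share(pupils, judgments))  # {'A': 1.0, 'B': 0.0, 'C': 0.5}
```

Because the count grows roughly with the square of the group size, a class of forty would already require 780 judgments from each rater.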


Constructing and Using Checklists and Rating Scales


SOURCES OF MATERIAL 

The teacher who wishes to construct a checklist or rating scale for 
classroom use has several possible sources of material which can be 
utilized. As a first resource, there is the teacher himself. No matter 
what the level or subject matter, the teacher usually has some specific 
goals and sub-goals for particular attitudes, information, and skills 
in which he feels his students should grow. The teacher himself, 
therefore, can list the kinds of things he wants his students to achieve 
by any given time. The teacher, by utilizing his own resources, can 
develop a checklist very rapidly. 

While this approach may obviate the arguments and bickerings
which sometimes accompany a cooperative enterprise, using just one's
own ideas has serious shortcomings. First, the teacher may have some
biases or pet prejudices of which he is unaware and which may creep
into his checklist. Second, the work of one person may be narrower
in scope than the outcome of a cooperative effort involving other
interested teachers and pupils. Third, when a cooperative approach
is utilized in constructing a checklist or rating scale, those who help
realize its purposes and values more readily. Consequently, other
teachers and pupils will participate more willingly in its
administration.

In the light of the relative strengths and weaknesses of an
individualistic approach, one might well make it a practice to use other
teachers and pupils as a source of material for checklists and rating
scales. Not only are a number of sound educational objectives
promoted in such cooperative work, but there would also be the
advantages of (1) a wider range of ideas and suggestions, (2) increased
objectivity because of the corrective influences of others, (3)
increased likelihood of the device applying to all the groups who are
supposed to take it, (4) increased probability of getting valid
responses from pupils who accept and favor the device because they
have had a hand in constructing it, (5) likelihood of more intelligent
use of the results, and (6) less likelihood of errors in typography and
English.

A third source of items for the checklist is the expert or group of 
experts. When it is urgent to get a job done quickly and when money 
is available, the school can gain many suggestions from capable con- 
sultants. The expert may be a member of a liberal arts or teachers 
college in a local or nearby university. Sometimes, state supervisory 
personnel are available for such tasks. For more elaborate evalua- 
tions, a short workshop may be set up under public or private auspices. 

A fourth source of checklist items is the numerous types of "job 
analyses" which have been made on various reading, writing, arith- 
metic, and citizenship skills and appreciations. Information concern- 
ing such valuable analyses of various attitudes, information, and skills 
in all fields is contained in the yearbooks of the National Society for 
the Study of Education and publications of subject-matter specialists 
in mathematics, social studies, English, music, etc. In addition, stand- 
ard textbooks in the teaching of various subjects at both elementary 
and secondary levels are rich sources for checklist material. Research 
studies are also valuable sources of materials. The checklists reported 
in such studies may serve as the bases for modified devices which 
suit local purposes and local situations. 

Courses of study and curriculum bulletins constitute still another
source of material. Large city and state systems publish courses of
study which can be purchased or examined in reference libraries of
local universities.




COMMON ERRORS IN CONSTRUCTION 


Several faults generally appear in constructing checklists and rating
scales. Among the more common faults are unclear headings, use of
too many items, overlapping items, and inclusion of extreme items.
In part, some of these faults arise from the fact that certain traits are
more amenable to rating than others. Thus, Shen (4) found that it
was possible to rate such traits as scholarship, leadership, and
intelligence more adequately than judicial sense and tact. The
development of a rating scale or checklist, then, demands an adequate
description of what is being rated in such terms that the person using
the instrument has a clear picture of the behavioral characteristic
involved.

To some degree, such weakness may be minimized by preparing a 
manual of directions which contains specific details about the mean- 
ing of each quality or characteristic included on the instrument. It is 
important, too, to try out the instrument, using several raters and a 
few subjects, to check for possible weaknesses. Judges should be en- 
couraged to raise questions when any doubts arise. 

Another common error which one finds on rating scales is the use
of too many or too few units on the scale which is employed. The
use of too many units or steps makes it difficult for the rater to
discriminate between one unit and the next; too few steps make for
coarse ratings, since the rater is not given an opportunity to exercise
discrimination to the extent to which he is capable. While no rule
can be given for the exact number of steps which should be included
on a rating scale, in most instances the use of seven units will yield
optimal reliability (6).


ERRORS IN USE 

A very common error in using checklists and rating scales in school
situations grows out of the failure to employ enough judges, with
consequent loss of reliability. While experts differ in their
recommendations concerning the number of persons whose judgments should be
pooled, at least three independent ratings should be obtained in most
instances in which human traits are being evaluated (8).
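
A minimal Python sketch of pooling independent ratings is given below. It assumes that pooling means taking a simple average and that the three-judge minimum from the text is enforced as a hard check; both choices, and the name `pooled_rating`, are illustrative assumptions rather than a procedure taken from the sources cited.

```python
from statistics import mean


def pooled_rating(ratings, minimum_judges=3):
    """Average independent ratings, refusing to pool fewer than the minimum."""
    if len(ratings) < minimum_judges:
        raise ValueError(
            f"need at least {minimum_judges} independent ratings, got {len(ratings)}"
        )
    return mean(ratings)


# Hypothetical ratings of one pupil by three judges on a 1-to-5 scale.
print(pooled_rating([4, 5, 3]))  # 4
```

Averaging only helps when the ratings are made independently; judges who confer beforehand tend to share the same errors.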

Individual judges may also make errors which tend to reduce the
reliability of checklists and rating scales. Frequently, personal bias
may result in ratings which are too high or too low, or in failure to
check favorable or unfavorable items on a checklist. When scale values
are employed on a rating scale, many judges feel constrained to
avoid extreme ratings and tend to use only the central values of the
scale. The rater's general mental attitude or general impression of a
person being rated also results in a systematic error. Rugg (3) has
pointed out that this “halo” effect leads the rater to attribute the
general impression or attitude to particular qualities. Ratings of character
and personality traits are most likely to be subject to this halo effect.


Summary 


Checklists and rating scales are similar types of evaluative devices. 
The former is a list of words, phrases, sentences, or paragraphs follow- 
ing which an observer records a check to denote presence or absence 
of that which is being observed. The rating scale is a list of qualities 
set down on a continuum upon which a rater indicates a value or 
rating. There are various kinds of checklists and rating scales which 
may be used to evaluate both technique and product. Using these 
devices may: (1) promote good teaching, (2) aid in curriculum plan- 
ning, and (3) improve school administration. 

Sources of items for checklists and rating scales may be found 
through self-analysis, by consultation with others, by studying job 
analyses, and by examining texts, courses of study, curriculum
bulletins, and standardized scales and checklists. Common faults to avoid
in constructing checklists and rating scales are unclear headings, use 
of too many items, overlapping items, inclusion of extreme items, and 
use of too many or too few steps on the scale which is employed. 
Common errors in the use of checklists and rating scales grow out of 
failure to employ enough raters, personal bias, failure to use extremes 
of the scale, and halo effect. 


Problems for Class Discussion 


1. (a) Construct a checklist designed to elicit the favorite television pro- 
grams of a group of upper-elementary-school children. Arrange for 
the administration of the checklist to a class. 

(b) Convert the checklist into a rating scale by the inclusion of an
appropriate rating device, and administer this scale to the same group.

(c) To what extent do similarities or differences appear when the results 
are compared? 

2. Construct a sequence checklist which would enable you to evaluate the 
technique and product of a vocational-school pupil called upon to (a) 
change a tire, or (b) give a finger wave, or (c) sew a French seam. 

3. Construct a rating scale designed to provide a measure of the
effectiveness of a teacher. Ask a number of pupils in a high school to apply the
scale to their teacher. To what extent do the pupils agree in their ratings? 
To what factors may differences in ratings be attributed? 




References Cited in This Chapter 


1. Cooperative Study of Secondary-School Standards, Evaluative Criteria. 
Washington, D. C.: Cooperative Study of Secondary-School Standards, 
1950. 

2. Guilford, J. P., Psychometric Methods. New York: McGraw-Hill, 1936. 

3. Rugg, Harold O., “Is the Rating of Human Character Possible?” Journal
of Educational Psychology, 12:425-438, 485-501, 1921; 13:30-42, 1922.

4. Shen, Eugene, “The Reliability Coefficients of Personal Ratings,” Journal
of Educational Psychology, 16:232-236, April, 1925.

5. Staff of Personnel Research Section, Adjutant General’s Office, “The 
Forced Choice Technique and Rating Scales,” American Psychologist, 
1:267, 1946. 

6. Symonds, Percival M., Diagnosing Personality and Conduct. New York: 
Century Company, 1931. 

7. Thurstone, L. L., “Rank Order as a Psycho-Physical Method,” Journal of 
Experimental Psychology, 14:187-201, 1931. 

8. Thurstone, L. L., “The Method of Paired Comparisons for Social Values,”
Journal of Abnormal and Social Psychology, 21:384-400, January-March,
1927.

9. Tyler, Ralph W., “A Test of Skill in Using a Microscope,” Educational
Research Bulletin, Ohio State University, 9:494, November 19, 1930.


References for Further Reading 


Symonds, Percival M., Diagnosing Personality and Conduct. New York:
Century Co., 1931.

Chapter III presents an excellent summary of various rating methods 
and their values and limitations. 

Torgerson, Theodore L., Studying Children. New York: Dryden Press, 1947. 

This little volume contains a number of rating scales which are useful 
in studying the behavior of children, as well as behavior descriptions 
which are helpful in constructing checklists. 

Cunningham, Ruth, et al., Understanding Group Behavior of Boys and Girls. 
New York: Bureau of Publications, Teachers College, Columbia Univ., 
1951.

Chapter XI describes the use of checklists and rating scales developed 
for use in a comprehensive study of the group behavior of children. 
Samples of the instruments used are given in the Appendix. 


CHAPTER TEN


Personal Reports and Projective Techniques


Probably no other psychological characteristics are so 
difficult to measure as the intangible and complex qualities involved in 
personality. In recent decades an increasing variety of personal re- 
ports and projective techniques have been devised to aid in the anal- 
ysis and evaluation of personal and social characteristics. The at- 
tempts to devise more adequate measures have provided a variety 
of methods, from self-descriptive tests with standardized questions
and problem checklists to tests or techniques in which the individual 
projects freely his responses to standardized ink blots or pictures. In 
addition to these techniques, other methods include analysis of hand- 
writing, analysis of paintings or drawings, completion of unfinished 
sentences or stories, dramatic play, and situational tests using toys or
puppets. The experimentation and research in this field continue
unabated. Significant progress in the evaluation of personality has been
achieved, but many problems still require comprehensive and rigor- 
ous study before satisfactory solutions will be achieved. 


Scope and Nature of Personal Reports
and Projective Techniques


Personal reports are instruments of evaluation which rely upon the
individual to be evaluated for opinions regarding his own behavior, 
feelings, and traits. In personality tests, inventories, problem check- 
lists, and autobiography, the individual describes or rates himself and 
reports his likes, dislikes, conflicts, and problems. Personal reports 
usually consist of verbal responses to a list of standardized questions. 

Projective techniques are intended to go beyond testimony and
self-diagnosis by the individual. They make possible a qualitative,
comprehensive, and intensive study of the personality. Projective methods
include all the devices (ink blots, cloud pictures, thematic pictures,
play materials) that enable the subject to project himself into a situation
outside of himself. Through reacting to unstructured stimuli, the sub- 
ject supposedly expresses his own way of organizing experience. He 
expresses his way of seeing life, its significance and patterns, and his 
own meaning and feelings (6). As Murphy (10) says: "He sees in it 
(the stimulus situation of the ink blot, the picture) what he personally 
is disposed to see or does with it what he is personally disposed to 
do." The mechanism of projection implies that the subject is unaware 
of revealing himself in the handling of the stimulus material. 

Personal reports, with their emphasis on self-description, play an 
important role in evaluation. They can be administered and inter- 
preted without great difficulty, and they provide important clues for 
understanding the individual and his problems. Projective techniques 
hold the promise of providing deeper insights into those areas of per- 
sonality which the individual himself is unwilling to reveal directly, 
or of which he may not even be conscious. Difficult to administer and
interpret, however, they are valuable only in the hands of the
specialist.


OBJECTIVES OR ASPECTS OF BEHAVIOR 
MEASURED BY THESE TECHNIQUES 

Personality tests, self-descriptive inventories, problem checklists, 
and other personal report methods evaluate specific aspects of per- 
sonality, whereas projective techniques approach personality structure 
in a more comprehensive manner. Hunt (7) says that personal reports
may provide an index of personality traits or components, such as
emotional adjustment, self-control, social initiative, self-sufficiency,
self-determination, self-esteem, ascendance-submission,
dominance-submission, cheerfulness-depression, introversion-extroversion,
social introversion, depression-elation, neurotic tendency, personal
inferiority, social inferiority, emotional maturity, and other aspects of
personality and behavior. Frank (6) says that projective techniques were
developed to reveal a global picture of an individual's personality
organization and to provide insight into his private world of feelings
and meanings. The advocates of these techniques feel that they make
possible evaluation of various forms of mental conflict, disturbances,
sentiments, and feelings.


DEGREE OF EXPERTNESS REQUIRED FOR
ADMINISTRATION AND INTERPRETATION

Personality tests and inventories, like other methods involving
self-reporting, are administered without serious difficulty by the counselor
or classroom teacher. Most of the tests are accompanied by manuals
which explain administration and scoring of the test. The classroom 
teacher should have no difficulty in carrying out instructions. Teach- 
ers should attempt interpretation of test results only if they have com- 
pleted some study in the field of mental hygiene and tests or measure- 
ment. It is advisable to interpret test scores jointly with other teach- 
ers and the guidance worker, combining the scores with other data 
about the individual's behavior and conduct. 

Projective techniques vary in the degree of expertness required for 
their administration and interpretation. At one extreme the Rorschach 
test, and to a lesser extent the Thematic Apperception Test, require 
specialized skill for administration and interpretation. Techniques 
utilizing drawing, painting, and play are simpler to work with and 
present less difficulty in the area of interpretation, but yield corre- 
spondingly less revealing analyses. All methods require some special 
training in analysis and interpretation. 


CAUTIONS ABOUT VALIDITY AND RELIABILITY 


Both personal reports and projective techniques present problems 
of validity and reliability. The validity of the self-report instrument 
seems to be related to the kind of questions or items included.
Objective information about the physical conditions of the home, for
instance, appears to be reported accurately. For more personal items
there is the tendency to give the “best” answer. Individuals are under 
social pressure and often feel compelled by their own personal experi- 
ences to conceal what they actually think or believe or feel. Responses 
on self-report tests may be different when the individuals are asked 
to sign their names to the tests and when they are not. Studies com- 
paring the rating of judges on certain personality traits with test re- 
sults do not show satisfactory correspondence, according to Traxler 
(18). 

Though numerous studies have attempted to "validate" projective 
techniques, many questions are still left unanswered. The studies have 
attempted to show, for instance, that diagnosis of a mental disorder 
based on a Rorschach test will correlate highly with the diagnosis 
given by a psychiatrist on the basis of the clinical history and ob- 
servation of the patient. Caution is nevertheless indicated, particularly 
in the use of the Thematic Apperception Test, drawing and painting, 
play, and other techniques. Though these techniques provide valuable
clues to the nature of the personality structure and problems involved,
the interpretation of material obtained through them must still be
considered an hypothesis to be followed up by other techniques and
by observation.


Personality Tests or Inventories 


A personality inventory is any collection of questions or statements 
designed to yield data on an individual’s social and emotional
adjustment. The inventory or test usually consists of a large number of
questions and statements which the subject may answer with “yes” 
or “no,” or with which the subject may agree or disagree on various 
levels of intensity. The individual rates or describes himself within a 
setting which has been compared to that of a "standardized inter- 
view." 

Many of the personality inventories are based on the trait theory 
of personality, which conceives of personality as composed of traits, 
or tendencies to react in a consistent way in response to a defined 
class of stimuli (4). Thus, personality inventories have purported to 
measure, among other factors, self-control, self-sufficiency, ascend- 
ance-submission, dominance-submission, neurotic tendency, social in- 
feriority, social introversion, emotional maturity, social adjustment, 
and health adjustment. The tests measure the individual's conform- 
ance to or deviation from group norms, which can be established for 
age, sex, and other group characteristics. The test scores of an indi- 
vidual then automatically classify him in relation to the group. 

Personal inventories are an efficient and economical method of eval- 
uation (12) which can be handled by most teachers and which may 
provide considerable insight. Flexible in nature, they can be developed
to cover every aspect of an individual's traits, values, needs,
attitudes, and experiences. Their limitation, as mentioned, derives from
the assumption that an individual knows and is willing to reveal
important aspects of his personality. Moreover, the attempt to divide
personality into separate entities is open to question.

Table 2 lists a few of the personality inventories which have been
employed in the evaluation of specific aspects of social and emotional
adjustment for certain age groups.


TABLE 2
Illustrative Personality Tests and Inventories

TEST AND PUBLISHER            DATE  ASPECTS MEASURED              LEVEL

Aspects of Personality        1939  Ascendance-submission,
(World Book Company)                extroversion-introversion,
                                    emotionality

Behavior Preference Record    1953  Cooperation, friendliness,
(California Test Bureau)            integrity, leadership,
                                    responsibility

Bell Adjustment Inventory     1933  Home, health, social, and     Grades 9-16,
(Stanford University Press)         emotional adjustment          Adults

Bernreuter Personality        1933  Neurotic tendency,            Grades 9-16,
Inventory                           self-sufficiency,             Adults
(Stanford University Press)         introversion-extroversion,
                                    dominance-submission,
                                    sociability, confidence

California Test of            1950  Self-adjustment,              Grades Kg-3, 4-8,
Personality                         social adjustment             7-10, 9-16
(California Test Bureau)

Cowan Adolescent              1946  Adjustment to fears, family,  Ages 12-18
Adjustment Analyzer                 emotions, maturity
(Cowan Research Project,
Salina, Kansas)

Gordon Personal Profile       1953  Ascendancy, responsibility,   Grades 9-16,
(World Book Company)                emotional stability,          Adults
                                    sociability

Heston Personal               1949  Analytical thinking,          Grades 9-16,
Adjustment Inventory                sociability, emotional        Adults
(World Book Company)                stability, confidence,
                                    personal relations,
                                    home satisfaction

Johnson Temperament           1945  Nine behavior patterns:       Adults
Analysis                            composed-nervous,
(California Test Bureau)            quiet-active,
                                    objective-subjective, etc.

Mental Health Analysis        1946  Mental health liabilities and
(California Test Bureau)            assets: feelings of
                                    inadequacy, etc.; social
                                    participation, etc.

Minnesota Multiphasic         1951  Depression, hysteria, social  Ages 16
Inventory                           introversion, etc.            and over
(Psychological Corp.)

Minnesota Personality         1941  Morale, social adjustment,    Grades 11-16
Scale                               family relations,
(Psychological Corp.)               emotionality, economic
                                    conservatism

Personal Audit                1945  Nine areas: seriousness,      Grades 9-16,
(Science Research                   stability, tolerance,         Adults
Associates)                         persistence, etc.

Thurstone Temperament         1950  Seven traits: impulsive,      Grades 9-16,
Schedule                            dominant, stable, social,     Adults
(Science Research                   etc.
Associates)

Washburne Social              1940  Truthfulness, happiness,      Ages 12
Adjustment Inventory                alienation, sympathy,         and over
(World Book Company)                purpose, etc.


A list of sample items from a self-descriptive inventory that was
devised and used by one of the authors in a large city school system
is illustrative of the items generally used in personality inventories.

Are you troubled at night by dreams about your work ....... Yes ( ) No ( )
Do you sometimes see spots "swimming" before your eyes .... Yes ( ) No ( )
Do you lose your temper and get angry easily .............. Yes ( ) No ( )
Do you like to see others in pain ......................... Yes ( ) No ( )
Do you find it more pleasant to live in a "make believe"
world than in the real world .............................. Yes ( ) No ( )


Interpretation

Manuals with instructions for scoring and interpreting personality
tests and inventories usually accompany test forms. In some tests, a
specific numerical value, either positive or negative, is assigned to
each answer, and a score is obtained by adding the numbers for the
parts, or for the test as a whole. Scoring becomes more difficult when
the instructions demand statistical manipulations and correlations
between parts. Test results can be tabulated by hand or by machine.
The score made by the individual on various parts of the test is
compared to the norm established by previous testing.
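
The scoring procedure just described (a signed value for each answer, summed by part and for the whole test, then compared with a norm) may be sketched in Python as follows. The scoring key, the part names, the norm figures, and the use of a standard score for the comparison are all hypothetical and chosen only for illustration; an actual inventory supplies its own key and norms in its manual.

```python
def score_inventory(answers, key, parts):
    """Sum the keyed value of each answer by part and for the whole test.

    answers: {item_number: "yes" or "no"}
    key:     {item_number: {"yes": value, "no": value}}  (hypothetical key)
    parts:   {part_name: [item_numbers]}, each item in exactly one part
    """
    part_scores = {
        part: sum(key[item][answers[item]] for item in items)
        for part, items in parts.items()
    }
    return part_scores, sum(part_scores.values())


def compare_to_norm(score, norm_mean, norm_sd):
    """Express a score as a deviation from the norm in standard-deviation
    units (one common way of comparing; assumed here, not given in the text)."""
    return (score - norm_mean) / norm_sd


# Hypothetical three-item inventory with two parts.
key = {
    1: {"yes": 2, "no": -1},
    2: {"yes": 1, "no": 0},
    3: {"yes": -2, "no": 1},
}
parts = {"emotional": [1, 2], "social": [3]}
answers = {1: "yes", 2: "no", 3: "no"}

part_scores, total = score_inventory(answers, key, parts)
standing = compare_to_norm(total, norm_mean=1.0, norm_sd=2.0)
```

Keeping the key and the norms as data, separate from the arithmetic, reflects the fact that these come from the test manual and differ from one inventory to another.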

Many psychologists question whether personality inventories and
tests measure what they are designed to measure. Independent ratings
by judges of the personality characteristics measured by these tests
often show a low correlation with the test results. Nevertheless, most
authorities agree that self-descriptive scales are valuable methods of
appraisal which can materially contribute to the guidance program of 
a school. Traxler (18) feels that data provided by the tests fulfill the 
following two important functions: (a) they stimulate the pupils to 
evaluate critically their own personality characteristics, and (b) they 
serve as a point of departure in conferences between counselor and 
individual pupils. He further states that inventories are helpful in lo- 
cating pupils who are “poorly adjusted and who need guidance in 
making emotional and personal adjustments.” The data from the tests 
become part of the cumulative record of the pupil. The teacher is ad- 
vised to do some reading and study in the field of personality and 
mental hygiene before attempting to use and to interpret personality 
tests of this kind. 


Problem Checklists 


Problem checklists are self-reporting questionnaires which deal 
with problems rather than with personality characteristics or interests. 
The checklists, such as those developed by Mooney and associates at 
Ohio State University, consist of phrases intended to make it easy for 
pupils to express their troublesome problems. The Mooney checklist 
for high-school students covers these problem areas: (a) health and 
physical development, (b) finances, living conditions, and employ- 
ment, (c) social and recreational activities, (d) social-psychological 
relations, (e) personal-psychological relations, (f) courtship, sex, and
marriage, (g) home and family, (h) morals and religion, (i) adjustment
to school work, (j) the future: vocational and educational, and
(k) curriculum and teaching procedures. Checklists for junior-high-school
students and adults cover more or less the same problem areas.

The procedure for using checklists is the following: The student 
underlines problems of concern to him, circles those of most vital con- 
cern, and answers a few summarizing questions which deal with his 
feelings about his problems.
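
A tally of one pupil's marks by problem area can be sketched in Python. The item numbers, the shortened area names, and the assignment of items to areas below are hypothetical; the published checklist has its own arrangement of items, which is not reproduced here.

```python
from collections import Counter


def tally_by_area(item_area, underlined, circled):
    """Count underlined and circled items in each problem area.

    item_area:  {item_number: area_name}   (hypothetical assignment)
    underlined: items the pupil marked as of concern
    circled:    items the pupil marked as of most vital concern
    Returns {area_name: (underlined_count, circled_count)}.
    """
    underlined_counts = Counter(item_area[i] for i in underlined)
    circled_counts = Counter(item_area[i] for i in circled)
    return {
        area: (underlined_counts[area], circled_counts[area])
        for area in sorted(set(item_area.values()))
    }


# Hypothetical four-item list covering three areas.
item_area = {1: "health", 2: "health", 3: "finances", 4: "home"}
tally = tally_by_area(item_area, underlined={1, 2, 3}, circled={2})
```

Such counts only locate areas in which marks cluster. They do not weigh one problem against another, and a single marked item may matter more to the pupil than several others.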

The checklists are not tests. They are intended to provide an indi- 
cation of behavior problems that occur in an individual or a group 
of individuals. The lists give students an opportunity to express their 
problems and provide assistance in understanding the problems ex- 
pressed. The lists are not diagnostic instruments but are descriptive 
and analytical. Lentz (3) writes that he knows of no alternative or
comparable instrument designed for the same purposes. Many teach- 
ers have commented that the lists have been helpful. 




Some of the items from the Mooney Problem Check List, High 
School Form, are: 


(Instructions: Read the list slowly, and as you come to a problem that
troubles you, underline it.)


1. Being underweight
5. Frequent illnesses
10. Wanting to earn some of my own money
15. Wanting to learn how to entertain
20. Uninterested in the opposite sex
25. Getting rid of people I don't like


Recently several problem checklists have been published. These 
include the SRA Junior Inventory, SRA Youth Inventory, and Life 
Adjustment Inventory. All problem checklists follow the same gen- 
eral format of presenting a list of problems, but the specific content 
and areas of problems vary among the checklists. 

The SRA Junior Inventory and the SRA Youth Inventory are dis- 
tributed by the Science Research Associates, Chicago, Illinois. The 
Junior Inventory is concerned with problems of boys and girls in the 
elementary school. Five major areas of problems are presented under 
these headings: (a) My Health, (b) Getting Along with Other People, 
(c) About Myself, (d) About Me and My School, and (e) About Me 
and My Home. Most of the items in the Junior Inventory were se- 
lected from a content analysis of essays written by hundreds of ele- 
mentary-school children on their problems. Additional items were ob- 
tained from teachers, guidance counselors, and pediatricians. The SRA 
Youth Inventory is designed to help identify problems of teen-agers 
and may be used in grades 7 to 12. This inventory consists of eight 
major areas of problems indicated by such headings as (a) My School, 
(b) Looking Ahead, (c) About Myself, (d) Getting Along with 
Others, (e) My Home and Family, (f) Boy Meets Girl, (g) Health, 
and (h) Things in General. The SRA Inventory is self-scoring and 
permits an easy identification of problem areas through the use of a 
self-interpreting chart filled out by the pupil himself. 

The Life Adjustment Inventory, distributed by the Acorn Publishing
Company, Rockville Centre, N. Y., has been designed to help
identify, through pupil responses to problems, improvements or 
changes in the secondary-school curriculum that will better meet 
pupil needs. This checklist conforms, in general, with the United 
States Office of Education's Life Adjustment Program. The inventory 
contains a list of problems that are organized under such headings as
(a) general feeling of adjustment to the curriculum, (b) reading and
study skills, (c) general social skills and etiquette, (d) boy-girl rela- 
tionships, (e) religion, morals, and ethics, (f) vocational preparation, 
(g) physical and mental health, (h) family living, (i) consumer edu- 
cation, and (j) use of leisure time. 


Interpretation 


In interpreting the checklist it is important to consider some of the 
following points listed by Mooney (9). Items marked should be con- 
sidered as symbols of the experiences and situations which comprise 
the individual's problem world. The item or problem checked should 
not be mistaken for the problem itself. Two students may mark the 
same problem or identical pattern of problems, and yet the problem 
world of the two might not be identical. Problems marked are not of 
equal significance: one item may prove to be more indicative of a 
substantial blockage in the life of an individual than a dozen others 
which he might mark. 

Mooney outlines five ways in which problem checklists may be 
used: (a) to make group surveys, which includes finding out what 
youth are thinking about in their personal lives and the location of 
students who want and need counseling or other personal aid, (b) to 
provide a basis for group guidance, orientation, and personnel pro- 
grams, (c) to increase teacher understanding in regular classroom 
teaching, (d) to facilitate guidance interviews, and (e) to conduct
research in the problems of youth.


Autobiography 


An autobiography is an individual's life story written by himself. It 
is an instrument of self-description which furnishes information fre- 
quently not obtainable by group or individual tests. Despite limitations, 
an autobiography or a composition of the autobiographical type may 
provide an understanding of the student's interests, abilities, personal 
history, hopes, ambitions, and desires (4). The autobiography often 
serves as a basis for an interview between counselor and student. The 
assignment to write an autobiography may be given to the pupil by the 
counselor, the home room teacher, or the English teacher. Autobiogra- 
phies may take a number of forms, ranging from short themes to 
lengthy narrative accounts. It will serve the purpose of guiding the pu- 
pil’s writing if the teacher will present him with a list of things to 
be included, such as early life experiences, family background and
history, health and physical record, school history, interests, leisure
time activities, hobbies, travel experiences, friendships, occupational
experiences, educational plans for the future, long-time vocational
plans, and desires and plans for marriage and home life.

Compositions with elements of autobiography in them may be revealing
because of the spontaneous introduction of personal material.
Strang (15:82) cites such a short composition of the autobiographical
type written by a senior-high-school girl on the subject “When I Felt
at a Loss.” A brief quotation from this account shows how much can
be revealed within a short space.

"At parties or any kind of dancing situation, when I am without a 
partner I feel ill at ease, at a loss as to what to do. I feel as though 
everyone is saying to themselves, ‘Oh there's Sally without a 
partner,’ in a rather unkind way. I want to cover up and try to look 
as though I weren't a wall flower, as though I had other things to
do besides dance. I don’t feel sure of the affections of people
outside of my family. I wonder if they really care for me as they say
they do. . . . This happens only with people I care very much
for. I want their affection so much I worry about it. It does not
bother me with some people. . . ."


Interpretation


The value of the autobiography is related to the conditions under
which it is obtained. For the autobiography to have value, pupils
must be ready to write and must understand what is involved. They
must feel sure that their confidence will not be violated. The person
who obtains the personal document must maintain a relationship of
mutual trust and respect.

The autobiography gives clues about the personality structure of
the individual and about his modes of thinking and feeling. For
contradiction or confirmation, the analysis of the pupil's writing may be
checked against observational and test data. Strang (15) points out
that with reticent or resistive children the autobiography is an
especially valuable source of understanding which gives indication of
special interests and talents and even of emotional difficulties not
revealed by other means.

It is important to remain cautious about the validity of the
autobiography as an instrument of evaluation. Items may be exaggerated
or minimized for reasons unknown to the teacher. Sometimes it is
helpful to verify a pupil's responses by making a home visit or by
including in an interview casual questions about doubtful items in
the autobiography.


The Rorschach Test 


The use of the Rorschach test as an instrument of evaluation is 
limited to psychologists, psychiatrists, and Rorschach specialists.
Training and skill are needed to administer, score, and interpret Rorschach
records. However, teachers may refer a child to a psychologist for 
the administration of a Rorschach, and interpretations of personality 
based on Rorschach records may be discussed in case conferences 
which involve teacher, principal, and guidance worker as a team. It 
may, therefore, be useful to define the technique, briefly illustrate its 
use, and explain some of the principles of interpretation developed by 
persons who have used it. 

The Rorschach is a "diagnostic test based upon perception" (13). 
The test derives its name from a Swiss psychiatrist, Hermann Ror- 
schach. Rorschach was interested in developing a diagnostic tool 
which would help the psychiatrist differentiate individuals according 
to patterns of mental characteristics. Experimenting with ink blots, he 
found that a great deal about an individual's personality organization 
is revealed by the manner in which he sees and interprets these blots. 
As finally developed, the test consists of ten cards each of which pre- 
sents an ink blot, a symmetrical figure, or shape, with different shades 
of black on a white background, and with colors in some of the cards. 
The shapes or figures on the cards are relatively unstructured. They 
can be seen and interpreted in many ways. Each individual who gives 
an interpretation of what the blots represent must give form and 
meaning to what the card offers. 


FIGURE 9. Ink Blot Similar to Those Used in the Rorschach Test


The instructions given to someone taking the test are these: 


People see all sorts of things in these ink blot pictures; now tell
me what you see, what it might be for you, what it makes you think
of.


The psychologist is concerned with three aspects of the subject's
responses. First, content: what did he see? Did he see human beings,
objects, etc.? Second, location: which part of the blot did the subject
utilize for his response? Did he use the whole card, a part of it, or a
small detail only? Third, determinants: what in the card seems to have
determined the response? What elements in the blot led the subject
to see what he did? Was it the color, form, shading, movement, or a
combination of these? Every response is scored for all three aspects.
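
The three aspects scored for each response can be pictured as a simple record, sketched here in Python. The class name, the field names, and the plain-word labels are illustrative assumptions; they are not the scoring symbols of any published Rorschach system, and the two responses shown are invented for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class ScoredResponse:
    card: str            # card on which the response was given
    content: str         # what was seen, e.g. "human", "animal", "object"
    location: str        # "whole", "part", or "small detail"
    determinants: tuple  # any of "color", "form", "shading", "movement"


def summarize(responses):
    """Count responses and tabulate each of the three scored aspects."""
    return {
        "n_responses": len(responses),
        "content": Counter(r.content for r in responses),
        "location": Counter(r.location for r in responses),
        "determinants": Counter(d for r in responses for d in r.determinants),
    }


# Two invented responses, for illustration only.
record = [
    ScoredResponse("I", "human", "whole", ("form",)),
    ScoredResponse("II", "animal", "part", ("form", "color")),
]
summary = summarize(record)
```

A tabulation of this kind is only bookkeeping. The interpretation of the resulting pattern remains the work of the trained specialist.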

The case of Sylvia L., reported by Krugman (8), will illustrate the
use of the Rorschach in the area of child guidance. It shows how
Rorschach records can be used to evaluate behavior and personality
problems. This case required the administration of the Rorschach before
and after psychiatric treatment to see whether treatment was sufficient.


“Sylvia was 9 years 7 months old when referred for study two years
ago by a cooperating family agency. The problem, as stated then,
was ‘school retardation, enuresis, doing things dangerous to herself,
migraine headaches, night terrors, tells fantastic stories, makes
faces at herself in mirror, peculiar behavior.’ At that time, need for
psychiatric treatment was evident, but the psychiatrist’s schedule
did not permit of her being taken on and she was placed on a waiting
list. In the meantime, the social worker from the cooperating
agency continued to work intensively with the child and the foster
parents with excellent results in handling and behavior, but some
neurotic symptoms still persisted. Four months ago the psychiatrist
took the child on for treatment and has worked with her regularly
since. Recently neurotic symptoms seem to have cleared, and the
psychiatrist administered another Rorschach examination, giving it
to the writer for interpretation and comparison with the original
Rorschach to see if treatment could be terminated.

“The original Rorschach showed a tense, anxious, repressed,
neurotic child, with much violence breaking through only in the
inquiry and not in the examination itself. The reexamination
showed exactly twice as many responses as the first (22 as against
11), and in general, showed a lifting of repression, with form less
rigid, with color coming out in the examination rather than in the
inquiry, much freer content, an elimination of shock reactions.
Examples of actual responses are even more illuminating than the
summarized protocol.”


SYLVIA L.    1ST ADMINISTRATION             2ND ADMINISTRATION

Card I       Looks like a lady, but she     Eskimo lady—Her head is
             has no head.                   chopped off.

[Card II]    (Much turning—can't see        (Quick) These 2 are bears—
             nothin'—) 2 dead bears.        their heads are chopped off
                                            —this is blood—their feet is
                                            chopped off.

Card III     2 skeletons.                   These 2 are ladies—their
                                            back is broken. This is a fire
                                            —they’re putting something
                                            on the stove.

[Card IV]    (Disturbance) a man.           (No disturbance) Giant’s
                                            feet—this is his hands and
                                            this is his tail—it’s all a
                                            giant.

Card VI      Up here a horse, and I don’t   This is an animal’s skin, and
             know what this is (bottom).    this is another one.

Card VIII    This looks like a bear—that’s  These are 2 bears—this is a
             all I could see—there’s some   flame—these are bodies—
             others but I forgot them—      the 2 bears are putting the
             No—that's all I could          bodies on the fire to burn.
             remember now.

Card IX      2 Reindeer— (laughs) I don't   2 animals are pulling the
             know.                          reindeers into the fire.

Card X       That could be a rabbit—This    This is a bear peeking out
             could be a cat—and that’s      thru his hole. These are
             all I see.                     people. These are boys. These
                                            are people’s bodies.


“Changes in the second examination are not due merely to maturation;
re-examination of many other children who have not been treated do
not yield such characteristic changes.


“The conclusion, submitted to the psychiatrist after interpreting the 
second Rorschach and comparing it with the first, was: ‘The re-exami- 
nation shows a more productive and much less inhibited girl, probably 
living less in fantasy, much freer in expression, less disturbed neuroti- 
cally, living out her fantasy, probably more annoying to people about 
her, but less disturbed intrapsychically.’ On the basis of his knowledge
of the case and results of the Rorschach re-examination, the psychiatrist 
decided to prepare the child for termination of treatment.” 




Interpretation 


Though the interpretation of a Rorschach record requires consider- 
able skill, and should only be made on the basis of all the responses 
to the ten cards, some general statements have been made as to the 
meaning of certain kinds of responses. A thorough analysis requires 
calculations involving all responses, each scored three ways. It is ad- 
visable to remain cautious as to the validity of some of the principles 
of interpretation to be listed. 


Color Responses (C) 

Individuals with an outgoing emotional disposition tend to give 
many color responses. This means that for them the color element in 
the blot will determine how the blot looks. One Rorschach specialist 
(18) has stated that “color responses always indicate a desire for an 
exchange of pleasure or pain with some other person.” Whether spe- 
cific personality characteristics can be inferred from one type of re- 
sponse is disputed by other workers in this area. There is still substan- 
tial disagreement among Rorschach psychologists as to the meaning 
of some responses. 


Whole (W) and Large Detail (D) Responses 

If a subject utilizes the whole blot for his interpretation in a number 
of cards, it may indicate a tendency to solve a problem in a compre- 
hensive and complete way. This would be “associated with intellectual 
ambition, interest in systematically organized ideas and theoretical 
matters” (4). Using a large part of the blot, but not the whole, has 
been considered a “common sense” attack, useful in practical affairs. 
W and D responses refer to that part of the blot utilized for the re- 
sponse, while the above-mentioned C responses refer to the elements 
in the blot which determined the response. 


Movement (M) 

Many people see the figures on the blots as moving and interacting.
They see some parts of the ink blot in the process of changing their
relative position to each other. Individuals with a rich inner life
typically have high M (movement) scores. According to Rorschach, the
movement response indicates an ability to create new, personalized
constructions, a capacity for “inner creation,” for living more within
oneself than in the outer world.


Form (F) 


If a response is solely determined by the outline or shape of the
blot, and not by its color, shading, etc., it is called an F (form)
response. Persons who produce F responses to the exclusion of color or
movement are said to be colorless and lacking in unique personality.


THE GROUP RORSCHACH 


One recent development of general interest in Rorschach testing is 
the use of the Group Rorschach. This type is administered to a group 
of individuals by means of slides projected onto a screen, and their 
responses are written down by the subjects. It can be administered 
rather easily, but interpretation of responses, of course, needs the 
skill of the trained specialist. The Group Rorschach cannot be used 
with very young children but has been used with older children, ado- 
lescents, and adults. This adaptation of the method, as well as a mul- 
tiple-choice Rorschach (in which the individual chooses from many 
responses the ones with which he agrees), has limitations, but has 
been effectively used as a screening device. 


Thematic Apperception and Other Picture 
Projective Tests 


Among the pictorial projective techniques, the greatest amount of 
research has been devoted to the Thematic Apperception Test. This 
test, developed at the Harvard Psychological Clinic by Henry A. 
Murray (11) and his co-workers, consists of a series of ambiguous 
pictures portraying one or more individuals. The subject is shown a 
picture and asked to write or tell a story about the person in that 
picture. The instructions given to the subject usually are a variation 
of the following: "This is a test of creative imagination. I am going 
to show you some pictures. Around each picture I want you to com- 
pose a story. Outline the incidents which have led up to the situation 
depicted in the story. Describe what is occurring at the moment—the 
feeling and thought of the characters, their relationship to each other. 
And tell what the outcome will be" (11). 

The individual is thus made to believe that his fantasy or creative 
imagination is being tested and does not realize that he is revealing 
his dominant drives, emotions, sentiments, complexes, and conflicts. 
Since the pictures are relatively unstructured—can be interpreted in 
several ways—the kind of interpretation which an individual gives is 
assumed to express his past experiences, conflicts, and wishes. The
hypothesis is that the individual will identify himself with one of the
figures portrayed in the scene and project his own conflicts upon that
figure. Bell (2) has defined the Thematic Apperception Test as "a
method for the stimulation, recording, and analysis of fantasy."

The problem in working with the TAT lies in verifying the analysis 
of interpretations. Granted that a specific fantasy about the characters 
in the picture is an indication of the personality of the individual who 
composed it, what assurance do we have that the analysis which we 
offer is correct? 

Following are descriptions of some of the pictures: 


1. A young boy is contemplating a violin which rests on a
table in front of him.
2. Country scene: In the foreground is a young woman with
books in her hand; in the background a man is working in
the fields and an older woman is looking on.
4. A woman is clutching the shoulders of a man whose face
and body are averted as if he were trying to pull away from
her.
6BM. A short elderly woman stands with her back turned to a
tall young man. The latter is looking downward with a
perplexed expression.
8BM. An adolescent boy looks straight out of the picture. The
barrel of a rifle is visible on one side, and in the background
is the dim scene of a surgical operation, like a reverie image.

FIGURE 4 A Picture from the Thematic Apperception Test.


Two stories from Tomkins (17:115, 118) will illustrate the kind of
stories composed, and the ways in which the psychologist makes use
of the story material for diagnosis. Tomkins relates them to show how
the TAT can help the psychologist determine, among other things, the
strength of the parental impact within the family setting upon the
child. Both fantasies have been given about picture number one
(young boy contemplating a violin which rests on a table in front of
him).

* Reprinted by permission, from Henry A. Murray, Thematic Apperception Test.
Cambridge, Mass.: Harvard University Press, Copyright, 1943, by The President
and Fellows of Harvard College.


"Sitting there is a boy about 10 years old gazing at a violin. Ah— 
he's quite an intelligent fellow and ah—I think perhaps he wanted 
to study the violin. So his mother and father bought him the violin 
and he started to take lessons. As the lessons progressed, he found 
there was more work involved than he had thought. He realizes 
that success wouldn't come as easily as he thought, and he will 
have to study and study before he becomes a great violinist. By 
the expression in his eyes, it looks as if he's wondering whether 
it’s worth all the effort. As he grows older, he stops taking violin 
lessons and listens to concerts. He will realize he was wrong to stop 
taking violin lessons." 


In his interpretation Tomkins is concerned only with the clues this 
story provides for gauging parental impact on the individual. He con- 
cludes that for the individual who developed the above fantasy, pa- 
rental impact is limited. In the story, according to Tomkins’ analysis, 
the parents are capable of doing what the child wishes, but the child 
is the instigator of their activity and they merely minister to his needs. 

The next fantasy, also stimulated by picture number one, presents 
a different interpretation. 


"Many years ago a boy got a present for his birthday. The reason 
that they gave him a violin was because they wanted him to become 
as interested in music as they were. As the little boy sat looking 
at the beautiful instrument disappointment showed in his face. He 
thought how much nicer it would have been if his parents had 
given him some toys. He realized how much the present meant to 
them so he tried to look as happy as possible. He was curious about 
the violin because he had never seen one before; he had only heard 
his father and mother talk about them, so he decided to see what 
it would do. He picked up the bow and drew it across the strings. 
He liked the sound and tried it again. Wouldn't it be fun to put 
all the sounds together, he thought. Now standing off stage at 
Carnegie Hall, waiting for the applause to die down, this great 
artist thanks his parents from the bottom of his heart." 


Tomkins says about this story: “The ideal of the parents . . . is 
foreign to the child's wishes but he conforms because he does not 
want to disappoint them. Parental impact is great and when he 
reaches maturity, conformance with parental wishes has been sufficiently
rewarding to bring about complete identification with parental
values." The second fantasy thus shows an individual who has projected
onto the figure of the boy his own relationship to his parents.




Interpretation 


There is at this time no uniform method for analyzing the fantasy 
productions developed by the individuals taking the Thematic Apper- 
ception Test. Many investigators have found it necessary to develop 
independent systems of analysis. Murray's scheme for the interpretation
of TAT data, one of the most comprehensive attempts at objectifying
TAT analysis, centers about the concepts of “need” and
"press." It is assumed that an individual making up a story about a
picture identifies himself with the figure or one of the figures (hero) 
in the picture. Every act which occurs in the story as initiated by the 
hero is classified in terms of the need or needs which it expresses. For 
instance, violence on the part of the hero would reflect the need of 
aggression. Among Murray's list of needs are those of aggression, 
dominance, achievement, recognition, rejection, sex, deference, harm 
avoidance, and others. The fantasy stories also express "presses," the 
forces which impinge upon the individual. These may be either harm- 
ful or beneficial. In analyzing the stories the psychologist would look 
for the kind of forces fantasied as impinging upon the hero. A list of 
presses would include: family insupport, lack or loss, rejection, birth 
of sibling, deception, etc. The fantasy productions have also been ana- 
lyzed by type of language used, vocabulary, syntax, etc. Other investi- 
gators emphasize types of endings (happy, unhappy, etc.). 

Interpretations of TAT records require a great deal of skill. Never- 
theless, cursory examination of records by the untrained individual 
may provide some insights into problems. As indicated earlier, the 
psychologist has data on stories given by many groups of people. He 
has available a body of data on how persons with a certain kind of 
behavior or personality problem have reacted to these same pictures. 
Thus, by means of the stories developed, he is able to classify his
subject into one or another group.


Other Picture Projective Tests 


Other picture projective techniques have recently been developed
to stimulate individuals to express their real feelings about such things
as the family, different religious groups, and labor. The advantage in
using pictorial techniques of this kind seems to be that the individual
interpreting a picture does not feel “on the spot” since he is not asked
to state his attitude but to interpret a picture outside of himself. The
limitations of these tests lie in their questionable validity and
reliability.


Among the more interesting of these picture tests are the Rosenzweig
Picture-Frustration Test (14) and Symonds' study (16) of
adolescent fantasy.

Rosenzweig's test consists of twenty-four pictures resembling in- 
complete cartoons. Each picture portrays two figures. The person on 
the left in the picture is the frustrating person and is either saying 
something which frustrates or is describing a situation which frustrates 
the person on the right. There is a caption box on the right for the 
person taking the test to fill in the very first reply which occurs to 
him to the actions or words of the figure on the left. For instance, a 
picture might portray the individual at the left splashing mud on the 
person on the right. After looking at the picture and presumably 
identifying himself with the person splashed, the subject would write 
a reply in the caption box under the person pictured at the right. 
The instructions take the following form: "Each of the following pictures
contains two or more people. One person is always shown saying
certain words to another. You are asked to write in the empty space
the very first reply to these words that comes into your mind. Avoid
being humorous. Work as quickly as you can." Rosenzweig analyzes
the responses in terms of the direction of the aggression expressed and 
the types of reaction to frustration. 

Symonds developed a test consisting of forty-two pictures designed 
for adolescents. Symonds (16) says that the psychological themes re- 
vealed by the pictures tap the major psychological drives to be found 
in the fantasies of adolescents in our culture. 


Drawing and Painting Techniques


Graphic expression has recently received much attention as a pro- 
jective device. Anderson and Anderson (1) devote several chapters 
of their book to this topic. The drawings and paintings of children 
and adults have been evaluated as characteristic of personality struc- 
ture and behavior problems. The art productions of children, ado- 
lescents, adults, the feeble-minded, the neurotic, and the insane have 
been investigated for both their diagnostic and their therapeutic value. 
The free, spontaneous expression achieved in pencil and crayon draw- 
ing, finger painting, or painting with water colors by young children 
has been promising for analysis and description of personality. The 
drawing and painting of adults is often influenced by considerations 
of style and form, which may overshadow the individual characteris- 
tics in the expressive product. 




Though many techniques leave the choice of topic as well as of 
medium to the child, other projective methods utilizing drawing and 
painting prescribe what shall be drawn. It is assumed that by using 
the same expressive problem, individual variations in expression will 
stand out most sharply. The Draw-a-Man Test, earlier devised by 
Goodenough as a measure of the mental age of a child, serves as an 
example of such a method. It consists in telling the child to draw a 
picture of a man. The investigators working with this method, ac- 
cording to Murphy (10), feel that it permits the child to project his 
conception of a man, his understanding of freedom and power, as ex- 
hibited in posture, gesture, and facial expression. Workers practiced 
in analyzing such productions claim that the drawing reveals the kind 
of man the child admires or fears. The child may further assimilate 
his own picture of himself into that of the man portrayed. In looking 
at the picture, one would consider such factors as whether the posture 
of the man drawn displays confidence and self-reliance, what the age 
of the man seems to be, and what secondary sex characteristics are 
included. The problem of drawing a recognizable man, of course, must 
have been solved by the child previous to the use of his drawing for 
interpretative purposes. 

A method providing, perhaps, the freest expression both in terms 
of movement and in choice of color is finger painting. Finger painting 
may be employed with young as well as older children. Bell (2) says 
it provides for rapid and easy creation of colorful pictures without 
the preliminary mastery of a complicated skill in handling tools. The 
necessary materials are paper, preferably glazed and inserted into 
water, and the colors which are added with a spatula and worked 
over with the fingers, hands, or forearms. The pictures when com- 
pleted may be placed on newspapers and pressed with an iron. In in- 
terpretation, the whole process engaged in by the child is analyzed. 
For instance, of significance are the part of the hands used in
painting, the posture, and the color used. One analyst claims that
leaning on one hand is an indication of self-consciousness and that
leaning against the table symbolizes lack of self-reliance. Lack of
neatness presumably indicates bad coordination or guilt. Male children
prefer to use the colors blue and green and female children prefer red
and yellow.

Many workers have suggested that there is a specific relation
between the use of certain colors and the presence or absence of
behavior and personality problems. Much guessing, as well as intuitive
insight, lies behind the formulation of such hypotheses. Bell (2)
presents a lengthy table listing some of these claimed relationships. For
instance, dark and smudgy colors indicate depression, whereas yellow 
and red indicate expressions of hostility and aggression. Strong yellow 
in some cases has been related to happy feelings, whereas red alone 
may indicate both feelings of affection and love and feelings of ag- 
gression and hate. Criteria of form have similarly been used to de- 
scribe personality. For instance, rigid contours and uniform rhythm 
have been interpreted as indicative of compulsion. Short, little strokes 
and few curved forms may express aggression. 

At present, the status of drawing and painting techniques for diag- 
nosis of personality is not certain. Although there are many hypotheses 
about the aspects of personality and adjustment measured by the art 
techniques, specific relationships have not yet been conclusively dem- 
onstrated. The teacher may wish to experiment with these techniques 
and through them obtain "hunches." The teacher's knowledge of the 
child may help him to understand the meaning of certain forms and 
colors for a particular child. Expression through art has a therapeutic 
as well as aesthetic function and is thus to be encouraged, but the in- 
terpretative hunches gained need validation by using observation and 
other evaluative tools. 


Play Techniques 


Play is a natural activity of the child. In it the child expresses him- 
self spontaneously and without self-consciousness. Informal and more 
formal play situations have been used as projective devices. The child 
has been left free to manipulate and rearrange materials which en- 
courage him to dramatize certain aspects of his environment which 
are of great significance to him. The materials employed have varied 
from the most “unstructured,” such as clay, paste, mud, and cold 
cream; to more structured, such as blocks, mosaic pieces, and beads; 
and finally to "well structured," such as dolls and miniature-life toys 
representing furniture, houses, and people. 

Clinical psychologists and psychiatrists have used play techniques 
not only for diagnosis but also for therapy. As reported in Anderson 
and Anderson (1), the play situation offers the child a cathartic ex- 
perience—that is, an opportunity to release socially unacceptable im- 
pulses and to discharge feelings associated with traumatic experi- 
ences, such as birth of a sibling, lack of parental love, and others. 

Play techniques have recently been widely used to give insight
into the child’s behavior, ideas, feelings, wishes, attitudes, and
fantasies. By their very popularity, they deserve the interest of teachers,
counselors, and guidance workers. To what extent the analysis of play 
provides valid data for guidance, however, is a matter of disagree- 
ment. The play techniques consist in exposing the child to play ma- 
terials such as miniature-life toys. He is given no instructions except 
to do what he likes with them. Murphy (10) indicates that the child 
uses the toys to portray his conception of his home, family, neighbor- 
hood, and world. He shows his parents and other adults, his dog, and 
himself, in terms of directly verbalized wishes or wishes which are in- 
directly portrayed in his conduct. Murphy cautions, however, that in 
interpreting the play process it must be kept in mind that much of 
the child's behavior represents a mirroring of reality. Still, a great deal 
of the child's distinctive, egocentric point of view emerges in the play. 

For work with children, play seems to be a very important pro- 
jective technique. However, a large part of what the child does in his 
dreams, his school work, or his contacts with his family has great pro- 
jective value, according to Murphy (10), because the child’s mode of 
expression is still a direct and spontaneous one. It must be remembered
that no one projective method can reveal all the facets of a
child’s personality. Play is no exception to this rule, though it is one
of the most comprehensive measures. It is most useful in focusing
the attention of the teacher or psychologist on certain areas which
need further exploration.


Miscellaneous Techniques 


There are various other techniques based on the assumption that 
personality expresses itself in many forms and that every individual 
reacts to stimuli in a characteristic and unique fashion.

The sentence completion test grew out of the previously accepted 
word association techniques, which had achieved some importance in 
clinical psychology. The sentence completion method presents to the 
individual short phrases that are to be completed to form whole
sentences. Items such as the following are illustrative.


1. I am unhappy when 
2. I become mad when 
3. I like 
4. I dislike 
5. I don't tell the truth when
6. I won't




The answers which the subject fills in presumably reveal emotional 
and social conflict areas. They are interpreted and scored in many 
ways. Some workers have restricted themselves to three categories of 
interpretation: (a) conflict or unhealthy responses, (b) positive or 
healthy responses, and (c) neutral responses. Others analyze the ma- 
terial more elaborately, classifying responses by areas of rejection, evi- 
dences of resistance, other methods of evasion, recurrent themes, and 
special or atypical associations. Opinions as to the validity of the data 
yielded by this method differ. Bell (2) states that an advantage of this 
technique lies in the ease with which it can be administered to large 
groups. However, he cautions that the validity of various versions of 
the test is not high enough so that it can be employed without other 
corroborative measures. 
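The simplest of the scoring schemes mentioned above, the three-category one, amounts to counting how a pupil's completions were judged. The following is a minimal sketch under stated assumptions: the category labels, the function name, and the sample ratings are invented for illustration, and the judgment of each completion is presumed to have been made by a human rater beforehand.

```python
from collections import Counter

# Assumed labels for the three categories: (a) conflict or unhealthy,
# (b) positive or healthy, (c) neutral.
CATEGORIES = ("conflict", "positive", "neutral")


def summarize(ratings):
    """ratings maps each sentence stem to the category a rater assigned.

    Returns, for every category, a (count, proportion) pair.
    """
    for stem, category in ratings.items():
        if category not in CATEGORIES:
            raise ValueError(f"{stem!r}: unknown category {category!r}")
    counts = Counter(ratings.values())
    total = len(ratings)
    return {
        c: (counts[c], counts[c] / total if total else 0.0)
        for c in CATEGORIES
    }


# Invented ratings for one pupil, for illustration only.
ratings = {
    "I am unhappy when": "conflict",
    "I become mad when": "conflict",
    "I like": "positive",
    "I dislike": "neutral",
}
result = summarize(ratings)
```

In keeping with Bell's caution, a summary of this kind could at most flag pupils whose responses merit further study; it would not stand in for corroborative measures.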

Story telling and story completion techniques closely approximate 
the Thematic Apperception Test in that they all use oral and written 
fantasies as their data. While the stimulus in the Thematic Apperception
Test is pictorial, the stimulus for a story or plot completion
technique is a written or oral theme which is to be elaborated. Story
telling as a projective device is more unstructured than story completion,
which does impose a limitation upon the child's fantasy. In story tell- 
ing, children are simply asked to make up a story about a boy or a girl, 
or about a father or mother. As reported in Bell (2), three major 
themes emerge from the stories made up by the children: What the 
child is afraid of—anxiety; what the child wishes to be—wish fulfill- 
ment; and what he fears he might do—sadism. 

In the story or plot completion technique, the subject is presented 
with a simple or complex plot to elaborate or to finish. For instance, a 
child may be given the following plot to finish: 


"Little Tommy goes to school in the morning. When the time comes 
to play he stays in a corner by himself . . ." 


(Instructions: Complete the story, telling why Tommy does not play 
with the other children) 


Another variation of this method consists in letting the subject 
choose one of a number of endings to a story which he was told. In this 
way the teacher or experimenter can observe whether the child prefers 
a happy ending, an unhappy ending, desires to reward a likable per- 
son, etc. 

Interpretation of this kind of material poses a difficult problem. 
There seem to be four elements which influence the kind of story a 
child would tell and the outcome the subject might decide on: (a)
books and movies, (b) events in which a friend or relative
participated, (c) the narrator's own experiences, subjective and
objective, and (d) conscious and unconscious fantasies. Though there is
a tendency to remember and select from experience those things most
related to the individual's personality and conflicts, a simple analysis
of the stories, without reference to the sources from which they derive, 
has severe limitations. 

Handwriting analysis as a form of personality interpretation is 
very old. Still, it is difficult to evaluate its validity as a measure of 
personality. Many systems of analysis are used and the "experts" 
differ in emphasis. Some workers in the area interpret handwriting 
as a whole, others make specific reference to length of lines, point 
pressure, and other detailed characteristics of handwriting. It is 
pointed out by handwriting analysts that "no two individuals, even 
when copying the same model, can carry out the same movement." 
Both factors of organization and deep level motivation express them- 
selves in handwriting. Murphy (10) lists five areas to which analysts 
pay attention in making graphological interpretations: 

a. The expansiveness and contractility in handwriting—how 
much space is filled up. 

b. The “emphasis”—speed and intensity, the amount of pres- 
sure applied. 

c. The style—aesthetic qualities such as symmetry, evenness, 
and unevenness.

d. Concern for social appearance—factors of orderliness, con- 
cern for correctness. 

e. Attitudes toward the self—how writing one's name, for ex- 
ample, in terms of extension and contraction compares 
with other words in the text. 

The validity of a handwriting analysis still largely depends on the 
"expert," the system used, and the frame of reference. Graphology re- 
mains a promising but inconclusive area of measurement research. 
Unfortunately, the employment of graphology as a pseudo-science in 
newspaper features has obscured the real progress made. 


Summary 


The number and variety of personal reports and projective tech- 
niques for evaluating personality is so large that a complete inventory 
of them would require an entire volume, rather than a chapter in a 
volume. In this chapter, however, a survey of representative and most
commonly used techniques is presented. Personal reports include self-
descriptive inventories and problem checklists with standardized 
questions or statements, as well as the autobiography. These methods 
require less specialized background and training for administration 
and interpretation than various projective techniques, which only spe- 
cialists can administer and interpret. 

Self-descriptive inventories and problem checklists evaluate such 
specific aspects of personality as self-sufficiency, dominance, submis- 
sion, introversion, extroversion, social adaptability, and anxiety about 
health, relationships with others, home and school problems. Analysis 
of an autobiography gives clues about modes of thinking and feelings 
of an individual, but these clues should be corroborated by other ob- 
servations and data. 

Projective techniques utilize such relatively unstructured stimulus 
situations as ink blots, thematic pictures, incomplete sentences, draw- 
ings or paintings, and toys or puppets to which the individual re- 
sponds, thus expressing his way of seeing life, its significance and 
patterns, and his own meanings and feelings. Among the most widely 
used projective techniques is the Rorschach test. The Thematic Apper- 
ception Test is also widely used by clinical psychologists. 

Other projective techniques include: variations of thematic pictures; 
analysis of paintings and drawings; play techniques using clay, mosaic 
pieces, dolls, puppets, and other toys; sentence completion and story 
completion tests; and analysis of handwriting. 

The validity and reliability of the various techniques vary consider- 
ably and are functions of the rapport between the examiner and the 
examined individual, the insight of the individual, and the expertness 
of the examiner in the interpretation of responses. Although the different
techniques provide valuable clues to the nature and organization
of personality, interpretations from any one should be corroborated by 
data gathered from other techniques and from observation. 


Problems for Class Discussion 


1. Administer a personality inventory or problem checklist to one or more 
pupils. Study the results to determine whether or not you can gain addi- 
tional insight about the pupils’ attitudes and personal-social behavior 
from the data. 

2. Administer a sentence completion test to a class of pupils. Analyze the
answers to gain the following tentative insights: (a) Which pupils give
responses which merit further study? (b) Which pupils give mostly
normal responses? (c) Which pupils give mostly neutral responses?


References Cited in This Chapter 


1. Anderson, H. H., and Anderson, G. L., editors, An Introduction to
Projective Techniques. New York: Prentice-Hall, 1951.

2. Bell, J. E., Projective Techniques. New York: Longmans, Green and Co., 
1948. 

3. Buros, O. K., editor, The Third Mental Measurements Yearbook. New
Brunswick, N. J.: Rutgers University Press, 1949.

4. Cronbach, L. J., Essentials of Psychological Testing. New York: Harper 
& Brothers, 1949. 

5. Division of Research and Guidance, Los Angeles County, Guidance 
Handbook for Secondary Schools. Los Angeles: California Test Bureau, 
1948. 

6. Frank, L. K., “Projective Methods for the Study of Personality," Journal 
of Personality, 8:389-413, October, 1939.

7. Hunt, J. McV., Personality and Behavior Disorders. New York: Ronald 
Press Co., 1944. 

8. Krugman, M., "Rorschach Examination in a Child Guidance Clinic," The 
American Journal of Orthopsychiatry, 11:503-511, July, 1941. 

9. Mooney, R. L., and Price, M. A., Manual to accompany Ross L.
Mooney's Problem Check List, College Form. Columbus: Bureau of 
Educational Research, Ohio State University, 1948. 

10. Murphy, G., Personality. New York: Harper & Brothers, 1947. 

11. Murray, H. A., The Thematic Apperception Test Manual. Cambridge, 
Mass.: Harvard University Press, 1943. 

12. Remmers, H. H., and Gage, N. L., Educational Measurement and Eval- 
uation. Revised edition. New York: Harper & Brothers, 1955. 

13. Piotrowski, Z. A., “A Rorschach Compendium,” The Psychiatric Quarterly,
21:79-101, February, 1947.

14. Rosenzweig, S., Rosenzweig Picture-Frustration Study. Pittsburgh: 
Western State Psychiatric Hospital, 1944.

15. Strang, R., Counseling Techniques in College and Secondary School. 
New York: Harper & Brothers, 1949. 

16. Symonds, P. M., "Inventory of Themes in Adolescent Fantasy," Ameri- 
can Journal of Orthopsychiatry, 15:318-328, April, 1945. 

17. Tomkins, S. S., The Thematic Apperception Test. New York: Grune 
and Stratton, 1947. 

18. Traxler, A. E., The Use of Tests and Rating Devices in the Appraisal 
of Personality. New York: Educational Records Bureau, 1938. 


References for Further Reading 


Anderson, H. H., and Anderson, G. L., editors, An Introduction to Projective 
Techniques. New York: Prentice-Hall, 1951. 

This book provides a good survey of the large variety of projective
techniques that have been devised to diagnose and evaluate personality.

Bell, J. E., Projective Techniques. New York: Longmans, Green and Co.,
1948.

This volume reports and appraises the numerous and varied projective
techniques that have been applied to the evaluation of personality. It is
well documented with references to representative studies made for each
technique.

Strang, R., Counseling Techniques in College and Secondary School. New
York: Harper & Brothers, 1949.

The emphasis in this book is upon the application and use of data derived
from various tests and techniques in counseling or guidance of the
individual.


CHAPTER ELEVEN | Sociometric Methods 


The modern teacher is interested in promoting the so- 
cial adjustment of each child. Trained to understand the interrelation- 
ships among physical, emotional, intellectual, and social factors in the 
learning process, the teacher needs to know how well and in what 
ways each child gets along with his peers. In many cases, the socially 
rejected child may become sufficiently disturbed emotionally to be 
hampered in his intellectual development, or, conversely, social mal- 
adjustment may be symptomatic of personal problems outside of school. 
The importance of the “social status” factor in the development and 
the make-up of the individual is great enough to warrant the expendi- 
ture of considerable teacher time and energy in diagnostic work. 

More specifically, of what importance is the relation of the indi- 
vidual to the group? The development of one’s personality is a product 
of the interaction of the individual with other people. One’s ideals, 
one’s motivations, and one’s many pleasures are products of social 
interaction. Also, one’s sense of security is based on the nature of the 
group’s acceptance or rejection. According to the ideals of a democratic
society, it is important for individuals to get along with one
another if a stable, functioning society is to be maintained. At a time 
when unity must be attained amidst diversity of political, economic, 
religious, and social beliefs, the principle of harmonizing the interests 
of the individual and the group, the personal and the social, is funda- 
mental to our way of life. For reasons significant for individual as well 
as social growth, therefore, the determination of the relationships 
which exist among the individuals in the group is an essential step in 
the diagnosis and reconstruction of individual and group living. 

Sociometry may be described as a means of presenting simply and 
graphically the entire structure of relations existing at a given time
among members of a given group (5). Modern sociometry is of relatively
recent development. Beginning with the post-war (World War
I) work of Moreno, whose Who Shall Survive? was the first book in 
the field, many investigators have contributed numerous theoretical 
and practical insights into the field of sociometry. Although sociometric 
testing is relatively new, the tests have yielded data of great educa- 
tional significance. Teachers now have available many sociometric ap- 
proaches which suit the varying needs of individual classrooms. This 
chapter discusses the construction and use of several types of socio- 
metric devices. 


Types of Sociometric Devices 
THE NOMINATION TECHNIQUE 


In this relatively direct form of approach, a question is so framed
that the individual is asked to name a limited number of people from 
within the group with whom he would choose to associate on the basis 
of some stated criterion (6). A typical question which could serve as 
the basis for an analysis of pupil choices might take this form: 


What other boys and girls do you want to sit next to you for 
the next month? You may have three choices. Name the boy 
or girl you most want to sit next to as your first choice, then the 
one you want as second choice, and as third choice. It is hard
to arrange seats so that everyone will have all his choices, 
but we will try to make sure that everyone will have at least 
one of his choices. 


It is important that the questions used as a basis for choices refer 
to real situations, and that some action will grow out of the choices 
which are made by the respondents. 


THE RATING-SCALE APPROACH 


Devices using this form ask the respondent to rate all the other 
members of his group, using some predetermined scale. Typical of 
this approach is the Ohio Social-Acceptance Scale, published by Ohio 
State University. The following six descriptions serve to guide the re- 
spondent in assigning scale values to his peers: 


1. I would like to have this person as one of my very, very 
best friends. I would like to spend a lot of time with this 
person. I would enjoy going places with this person. I 
would tell some of my troubles and secrets to this person 
and would do everything I could to help this person out of 
trouble. . . . 


2. I would enjoy working and being with this person. I would 
invite this person to a party, and would enjoy going on 
picnics with this person and our friends. I would like to 
work with this person and I would like to be with this 
person often. I would like to talk and make and do things 
with this person. I want this person to be one of my 
friends. . . .

3. I would be willing to be on a committee with this person
or to be in the same club. It would be all right for this person
to be on the same team with me or to live in my
neighborhood. I would be in a play with this person. I
would just as soon work with this person in school. This
person is not one of my friends, but I think this person is
all right. . . .

4. I do not know this person very well. Maybe I would like
this person, maybe I wouldn't. I don't know if I would
like to be with this person. . . .

5. I say “hello” whenever I meet this person around school or
on the street, but I do not enjoy being with this person. I
might spend some time with this person if I didn't have
anything else to do, but I would rather be with somebody
else. I don't care for this person very much. . . .

6. I speak to this person only when it is necessary. I do not
like to work with this person and would rather not talk
to this person.

Other investigators (4) have used a five-point rating scale: (1) 
Very, very best friends, (2) Good friends, (3) Not friends, but okay,
(4) Don't know them, (5) Not okay. Experience has indicated that 
the ratings assigned to the members of a group may be looked upon 
as points on a continuum. As such, the statistical treatment of the 
ratings assigned and received is relatively simple. 
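The simple statistical treatment referred to above may be illustrated by a minimal sketch. It assumes that the five-point scale is treated as points on a continuum (1 for very best friends through 5 for not okay) and that every pupil rates every other pupil. The pupil names, the ratings, and the function names are hypothetical and serve only as illustration.

```python
from statistics import mean

# Hypothetical data: ratings[rater][ratee] on the five-point scale
# (1 = very, very best friends ... 5 = not okay). Names are invented.
ratings = {
    "Ann":  {"Bob": 2, "Cara": 1, "Dan": 4},
    "Bob":  {"Ann": 1, "Cara": 3, "Dan": 5},
    "Cara": {"Ann": 1, "Bob": 2, "Dan": 4},
    "Dan":  {"Ann": 2, "Bob": 3, "Cara": 3},
}


def mean_rating_received(ratings):
    """Average the ratings each pupil receives from classmates."""
    received = {}
    for rater, given in ratings.items():
        for ratee, value in given.items():
            received.setdefault(ratee, []).append(value)
    return {pupil: mean(values) for pupil, values in received.items()}


def mean_rating_given(ratings):
    """Average the ratings each pupil assigns to classmates."""
    return {rater: mean(given.values()) for rater, given in ratings.items()}


received = mean_rating_received(ratings)
given = mean_rating_given(ratings)

# On this scale a lower mean indicates greater acceptance.
for pupil in sorted(received, key=received.get):
    print(f"{pupil}: received {received[pupil]:.2f}, gave {given[pupil]:.2f}")
```

A pupil's mean rating received describes his standing in the eyes of the group, while his mean rating given describes how he regards the group. Neither figure explains why the ratings were assigned.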


THE "WHO'S WHO" APPROACH

This sociometric technique constitutes an excellent means of obtain- 
ing insights into pupil difficulties in meeting the standards set by their 
peers. The Ohio Recognition Scale, published by Ohio State University, 
a typical instrument of this type, is an anonymous means of securing 
pupil judgments concerning their classmates. A portion of the directions
and some of the items included on the scale are reproduced below:


In this booklet there are many paragraphs. They tell about different 
kinds of boys and girls. As you read each paragraph, ask yourself:
“Is there anyone in our room like this?” If there is, put that person's
name under the paragraph. If you think of more than one person, 
write these other names, also. If there is nobody in your room like 
this, write “Nobody” and go on to the next paragraph. 


1. Do we have any boys and girls in our room who are very even- 
tempered, who almost never get upset or angry, who are al- 
ways calm, even when things go wrong? When somebody 
shouts at them, or even hits them, they don't get excited. They 
are always cool and level-headed. Who are they? 

2. Some boys and girls seem to be unhappy most of the time. 
They don't seem to know how to enjoy themselves. It's no fun 
having them around because they almost never laugh or tell 
funny stories or good jokes. They are almost never happy. Who 
are they? 

5. Some boys and girls always seem to feel at home wherever 
they are. They are not afraid to say what they think in class 
discussions. They ask questions if they don't understand some- 
thing. They don't mind meeting strangers and they can talk 
easily with grown people and older children. Who are some 
children who are almost never shy or bashful? 

9. Suppose you were going to choose people from this class to be 
on your committee. You want boys and girls who work well 
with other children, who will have some good ideas, who will 
work hard, and who will stick to the job until it is finished. 
They would know how to plan and they would do good work. 
In this class who would you choose for this committee? 

11. Are there any children in our room who are very friendly with 
everyone? These boys and girls are not snobbish or “stuck-up.” 
They like you as well today as they did yesterday—they al- 
ways act the same toward you. They like everybody. They take 
time to talk to you and seem interested in you. Who are they? 


The "Who's Who" approach undoubtedly gives the teacher informa- 
tion which he cannot gain through the use of the other devices. The 
major disadvantage of this technique lies in the difficulty of analyzing 
pupil responses in order to arrive at a measure of an individual's status 
in the eyes of the group. 


Some Uses of Sociometric Methods 


PROMOTING SOCIAL ADJUSTMENT 


One of the most important uses to which sociometric techniques 
may be put by the teacher is the identification of children who are in 
need of help in adjusting to the group. Research has indicated that the 
pupil who is high in sociometric status tends to express himself in
terms of characteristics which the group values (1) and to direct his
energies to activities which the group approves (8). In a sense, socio- 
metric scores are a measure of the degree to which a pupil undertakes 
activities which are highly regarded by his classmates or conform to 
the group's demands. The score which a pupil obtains on a sociometric 
test, then, may be looked upon as a measure of his drive toward social 
adjustment. 

Another use of the sociometric test is to promote common interests,
ideals, and skills among those individuals who do not seem to be
sharing such experiences. In a group of twenty-five to thirty-five
pupils, several children may be too shy or otherwise not socially effi- 
cient enough to associate with others. The sociometric test is of value 
in revealing such feelings and in helping the teacher set up situations 
for promoting social adjustment. 

The following sociometric test, disguised as an interest inventory, 
was used with a church group. This inventory can be modified for 
classroom use by inserting other types of activities and appropriate 
student names. 


PART ONE—Directions: Fill in the following information. It is not necessary
to put your name on this paper. If you like the activity, place an L
in the space provided. If you are indifferent to the activity, place an I;
if you dislike the activity, place a D.


L—Like     I—Indifferent     D—Dislike

____ Swimming                 ____ Go boating or canoeing
____ Ride horseback           ____ Go to a summer camp
____ Hiking                   ____ "Rough it" in the woods
____ Go on picnics            ____ Organize a baseball team
____ Hunting and fishing      ____ Pitch tents and set up camp
____ Go bicycling             ____ Build outdoor fires
____ Play badminton           ____ Organize food planning
____ Take photographs         ____ Lead a group on a hike
____ Cook outdoors            ____ Planning for others' fun
____ Lead games outdoors      ____ Be a committee chairman


PART TWO—Directions: If you would like to engage in these activities
with the following people most of the time, place an M in front
of their names. If you would like to engage in these activities with 
them some of the time, place an S in front of their names. If you would 
like to engage in these activities with them none of the time, place an 
N in front of their names. 


M—Most of the time S—Some of the time N—None of the time 


____ Marian A.       ____ Jean G.       ____ Bill R.
____ Jon A.          ____ June J.       ____ Pat S.
____ Don B.          ____ Carol K.      ____ Marian S.
____ Barbara B.      ____ Dick L.       ____ Marge S.
____ Eddie C.        ____ Earl M.       ____ Shirley S.


Of course, one should not assume that each pupil necessarily knows 
where his best interests lie, but student choices do represent at least
one aspect of the problem of setting up socially desirable goals. 


GROUPING PUPILS 


The sociometric device is also very valuable in the process of set- 
ting up committees for projects and for seating arrangements. Going 
on the assumption that it makes a difference in work efficiency and 
social pleasure when one works with persons of one's own choice, seat- 
ing students according to their expressed likes is a desirable practice. 
Of course, it may not always be possible to give each individual his 
choice, nor may it be best for the individuals concerned. Where fea- 
sible, however, the pupil's choice should play a part in the process of 
assignment to seats at tables or to committee work. It is through the 
unknown (to pupils) manipulations of the environment that some 
pupils may be helped to get along more efficiently in the group. Typi- 
cal questions asked of pupils are: “Whom would you like to sit next 
to?" "Whom would you like to work with on a committee?" "Name 
five pupils you would like to have at your table." "Name five students
you would like to have on your committee." These questions tend to 
elicit some of the information necessary for grouping pupils most 
effectively. Obviously, for final groupings, this information must be 
combined by the teacher with information about many other pupil 
characteristics and needs. 


MEASURING GROWTH IN GROUP STATUS 


The sociometric test may be employed to describe growth in group 
acceptance. At the beginning of the term one may find, as a result of
administering a sociometric test, that each pupil may desire to work
with certain other pupils, play with certain pupils, sit next to certain 
people, etc. If this test is administered periodically, the teacher might 
be able to trace the growth in the group acceptance of any given 
pupil. For instance, nobody may choose to sit at a table with James 
at the end of a month in school. As a result of the efforts of the teacher 
(and students too) James might have become more acceptable as
reflected in the second test. This change might be due to James’ desire
to change, the group’s better understanding of James, or a combina- 
tion of both factors or others. In any case, testing and retesting may 
reveal changes and possible growth or lack of growth in group ac- 
ceptance. 
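The comparison of a test with a retest may be sketched as follows. The example assumes that each pupil names two classmates on each administration and that group acceptance is described simply by the number of choices received. The ballots, the names, and the function name choices_received are hypothetical.

```python
from collections import Counter


def choices_received(ballots):
    """Count how many times each pupil is named across all ballots."""
    tally = Counter()
    for chosen in ballots.values():
        tally.update(chosen)
    return tally


# Hypothetical ballots: chooser -> the two classmates chosen.
first_test = {
    "James": ["John", "Mary"],
    "John":  ["Mary", "Edith"],
    "Mary":  ["John", "Edith"],
    "Edith": ["Mary", "John"],
}
second_test = {
    "James": ["John", "Mary"],
    "John":  ["Mary", "James"],
    "Mary":  ["John", "James"],
    "Edith": ["Mary", "John"],
}

before = choices_received(first_test)
after = choices_received(second_test)
pupils = sorted(first_test)

# A Counter returns 0 for a pupil whom nobody named.
change = {pupil: after[pupil] - before[pupil] for pupil in pupils}

for pupil in pupils:
    print(f"{pupil}: {before[pupil]} -> {after[pupil]} ({change[pupil]:+d})")
```

A rise or fall in the number of choices received only describes a change in status. As the text notes, the reasons for the change must be sought by other means.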


DETECTING CLEAVAGES 

The sociometric test is a fine method of revealing cleavages among 
pupils along undemocratic lines. When pupils reject one another on 
the basis of blind group prejudices or accept one another on the basis 
of irrational likenesses, one may say that the group reflects undemo- 
cratic patterns. By analyzing the choices of pupils in their friendships, 
work relationships, play relationships, and the like, one may detect 
possible prejudices in operation which reflect such undemocratic be- 
havior. Although there are differences of opinion as to the meaning of 
democratic behavior, still the teacher is obligated to promote some 
conception of democracy which involves a respect for the individual 
and a sense of equality of opportunity and responsibility for all. 

The teacher must remember that pupil responses to sociometric 
tests tend to reflect the fundamental characteristics of society outside 
of the school. When the teacher tries to find out why John will not 
work or play with James, he may learn that James lives across the 
tracks, and does not have the clothes, social habits, or the pocket 
money to keep up with people like John. Cleavages such as those of 
race, religion, and socio-economic status have deep roots, and the 
teacher alone cannot hope to cope with them. The teacher must work 
with others in and out of school, as a professional and as a citizen, to 
facilitate democratic social living. Simply telling one’s pupils to be
“nice” to one another will not lead to social betterment.

Jennings (2) has drawn up a very comprehensive "Sociometric
Analysis Schedule" to guide the teacher in promoting these general
uses of the sociometric technique.


SOCIOMETRIC ANALYSIS SCHEDULE 


1. What appears that you had expected would appear?

2. What appears that you had not expected to appear?

3. What seems to account for certain pupils being the most chosen and receiving
few, if any, rejections?

4. What seems to account for certain pupils being unchosen or receiving many
rejections?

5. What seems to account for the mutual choices?

6. What seems to account for the mutual rejections?

7. Can you think of any classroom arrangements which may account for the
above choices or rejections?

8. As you read the structure as a whole, do you think of any arrangements such
as classroom routines, lunchroom arrangements, play patterns, which might be
a factor in the general patterning of the sociogram?

9. What cleavages, if any, appear in this sociogram? Cleavage is here defined
as an absence of choices between individuals related to a "group factor."
Examples: boy-girl, economic, nationality background, religion, academic ability,
being employed after school, prestige of some special group, other group
factors.

10. Can you see any spots in the structure of the group as a whole that need to
be more related to the rest of the class group for better morale, such as a clique
by itself, several mutually choosing children, other children trying to get in
with no response?

11. In the light of your analysis of their interrelation-structure, what under-
standings and skills do you estimate they have already well developed? Which
do you estimate they need to develop further?

12. What do the majority of most-chosen children have in common? Examples:
race; don't work after school; socioeconomic level, fairly well off; live in open
community and not in housing project; most are Protestants; most have lived
in this community all of their lives; most take part in after-school and in-school
activities.

13. What do the unchosen and rejected children have in common? Examples:
different nationality; much lower socioeconomic level than rest; most live in
housing project; many of them work after school; many are new to community;
don't participate in in-school and out-of-school activities; present many dis-
cipline problems.

14. Are there visible signs of segmentalization in your community—association
patterns which divide according to race, religion, residence location, or any
other factor?
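The definition of cleavage in item 9 of the schedule, an absence of choices between individuals related to a group factor, suggests a simple count, sketched below. The example assumes a single group factor (boy-girl) and two choices per pupil. The ballots, the group assignments, and the function name group_to_group_counts are hypothetical.

```python
from collections import Counter


def group_to_group_counts(ballots, group_of):
    """Count choices by (chooser's group, chosen pupil's group)."""
    counts = Counter()
    for chooser, chosen in ballots.items():
        for other in chosen:
            counts[(group_of[chooser], group_of[other])] += 1
    return counts


# Hypothetical ballots and group factor.
ballots = {
    "John":    ["Frank", "Stanley"],
    "Frank":   ["John", "Stanley"],
    "Stanley": ["John", "Frank"],
    "Mary":    ["Edith", "Karen"],
    "Edith":   ["Mary", "Karen"],
    "Karen":   ["Mary", "John"],
}
group_of = {
    "John": "boy", "Frank": "boy", "Stanley": "boy",
    "Mary": "girl", "Edith": "girl", "Karen": "girl",
}

counts = group_to_group_counts(ballots, group_of)
total = sum(counts.values())
crossing = sum(n for (a, b), n in counts.items() if a != b)

for (a, b), n in sorted(counts.items()):
    print(f"{a} -> {b}: {n}")
print(f"{crossing} of {total} choices cross the group line")
```

Few or no crossing choices only point to a possible cleavage. Whether a prejudice is actually in operation is a matter for the teacher's further study, as the schedule implies.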


Constructing and Administering Sociometric Tests


There are no formal steps which have been laid down on how to 
construct sociometric tests. In general, the basic principles for con- 
structing any test or test question apply. The specific purpose of the 
teacher (and of the class, if there is pupil-teacher planning) is to de- 
termine the form and content of the sociometric test. Also, the educa- 
tional situation in large measure plays a significant part in the nature 
of the test.

There are many occasions in the educational process when the 
sociometric test may be used profitably for a variety of purposes. At 
the elementary-school level, there are game situations in which one 
pupil must select another, such as in the ring game, the Farmer in 
the Dell, the farmer takes a wife, the wife takes a child, etc.; these 
selections may reflect social groupings, though for children in kinder- 
garten and first grade, associations are generally temporary. In the 
lower grades, where there are generally self-contained classrooms,
there are possibilities for electing representatives to extend invitations
to other classes, clean-up squads, milk distribution squads, and com- 
mittees for carrying out a project. Also, where pupils sit at tables, 
monthly selections of tablemates have the elements for determining 
individual and group status. 

In the higher grades, selection of others in the class can take place 
in numerous ways. If the class elects officers, such voting
can reflect student position. Students can select teams, group
leaders for excursions, members of committees to work on a play 
(costumes, sets, actors, etc.), a class picnic or outing, tablemates, etc. 
In all cases, the teacher can satisfy the students that the “test” is serv- 
ing a practical purpose they can understand, though for himself the 
teacher should go beyond this surface set of facts and determine the 
social structure of his class. 

Insufficient attention has been paid to the problem of when to administer
sociometric tests in the course of the school term. A common
abuse is scheduling by impulse—the teacher suddenly administers a 
sociometric test because it is a “good idea,” because “it is being done,”
or because it was mentioned in a school or professional society meeting or
discussed in a course in measurements. It is wise to remember that sociometric
testing (like other testing) should be intrinsic to and continuous
with the regular work of the class. The sociometric test should be an 
important and useful exercise which helps carry the class forward to- 
ward its curricular objectives. 

The sociometric test may consist of one or more questions which 
elicit from the student his preferences, on the basis of some criteria or 
task, for classmates, schoolmates, or extra-school friends. Sometimes, 
in addition to simply selecting other youngsters, the student may be 
asked (when appropriate) to give his reasons for selecting or reject- 
ing different ones. In this way, the teacher can collect valuable evi- 
dence on the reasons, apparent or underlying, why pupils are popular 
or unpopular, accepted or rejected. 

An argument frequently raised against sociometric testing is that it 
tends to crystallize in the mind of a pupil his feelings of antagonism 
toward other pupils or his rejection of them. In other words, some 
teachers fear that the test will put ideas into an immature child's head. 
This may hold true when the sociometric test is improperly employed. 
If the students do not know each other well enough to have formed 
likes and dislikes, the sociometric test should not be used, since it 
cannot reveal what does not exist. More frequently, however, such a 
fear on the part of a teacher reflects naiveté. 


What questions may be used profitably to illuminate the status of 
group acceptance of the individual? Insofar as possible, questions 
should arise out of the regular activities of the classroom. The teacher 
should be sensitive to the values of certain occasions to obtain socio- 
metric data. The following examples may be considered as guides for 
drawing up specific questions more appropriate to a given classroom. 


1. We will need five people to serve as personal representa- 
tives of our class, whose job it will be to invite other 
classes to participate in our circus festival. Name the five 
people you want on this slip of paper. 

2. We need a bulletin board committee of three pupils. What 
three people would you want on the committee? 

3. We will need two boys and two girls to plan our class
party. Choose the four people and tell why you would
have each one on the committee.

4. Do you have any preferences about who should be in your
group when we go to visit the museum? Name the four
pupils whom you would like to accompany you. Are there
any pupils in the class whom you would not like to have
in your group?


ADMINISTRATION 


The effectiveness of the sociometric test, it must not be forgotten, 
is highly dependent on the teacher who uses it. A test which is valu- 
able in the hands of one teacher may be quite ineffectual in the hands 
of another. When a teacher is concerned with such highly subjective 
and emotionally charged material, the type of relations the teacher 
has with the students will make all the difference in the world in the 
responsiveness and honesty of the students. Sociometric techniques 
are of most value in classrooms where the teacher has maintained a 
generally informal classroom atmosphere. 

Some explanation of the purposes and uses of the test (it should not 
be called a test, because then students may react as though it were an 
"academic test") should be given the students so that they will be 
properly motivated to follow directions. The directions should include 
the specific manner in which students are to record their choices. 

The sociometric test question can be presented in various forms in 
the classroom. The question may be in written form, one copy for 
each child, it may be written on the blackboard, or it may be presented 
orally. Whatever form is used, separately or in combination, the
student needs to be able to record his choices on a sheet of paper or 
index card. Sometimes, it may be wise to mimeograph a list of names
(alphabetically or by seating arrangement) after which a student can
record first, second, and third choices or just three checks or X marks. 

The test should be administered so that each individual's preferences 
are kept confidential. Since the sociometric test involves data which 
are packed with dynamite (the personal feelings and preferences of 
children, and which reflect the ideals and practices of their families 
and their community), the teacher must be careful and considerate in
his use of such tests and their results. The teacher who promises to 
keep results confidential must observe this practice. To do otherwise 
may serve to widen the breach between some pupils and their class- 
mates, and to destroy pupil-teacher rapport. 

It is always a problem to the teacher whether to have pupils sign 
their ballots for choices or submit them anonymously. While there are 
times when anonymity may be advisable, in most instances all ballots 
should be signed. Signed ballots are a necessity in sociometric testing 
when sociograms are to be constructed. 


TABULATION 

Let us assume that the teacher has asked each of a class of twenty- 
five pupils to list the three pupils in the class with whom he best likes 
to work. Let us assume that the sample slip is a quarter of a regular 
8½" x 11" sheet of paper. Each slip might then look like this:

[Sample slip: the three chosen names, with the chooser's name at the bottom]

The name at the bottom is that of the one who made the three selections.
The teacher should then draw up a table which includes the names of the
twenty-five pupils vertically as well as horizontally, as shown in the
table below. (On the horizontal list one may substitute numbers
corresponding to the names.)

The tabulation is begun by placing a mark (x) in the appropriate 
columns on line five corresponding to the name of Stanley D. This 
process of tabulating continues until the three choices of the other
twenty-four individuals are recorded. Totals can be drawn up on a
separate line at the bottom to indicate the number of times an individual
was chosen. If there is no interest in keeping track of who
selected whom, one may simply keep tallies next to the list of names. 
If the names of the students are listed alphabetically on a mimeo- 
graphed sheet, it is possible to line the papers up so that one can 
tally horizontally the number of check marks after each name. Gen- 
erally, it is best to tally each question completely rather than a num- 
ber of questions at one time, because the latter often leads to errors 
and confusion. Since it is quite human to make errors, it is always wise 
to check each tabulation at least once. 


                      1   2   3   4   5   6   7   8   9  10  . . .  25 
 1. Benjamin A. 
 2. Frank B. 
 3. Karen C. 
 4. Helen C. 
 5. Stanley D. 
 6. John E. 
 7. James F. 
 8. Mary G. 
 9. Edith G. 
10. [name illegible] 
 . . . 
25. Mary S. 
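
The tabulation described above can also be sketched in a few lines of Python. This is only an illustration under stated assumptions: the five-pupil roster and the ballots are invented for the example (the names merely echo the sample table), and the helper names `tabulate` and `check_tabulation` are hypothetical. Each row of the matrix is one pupil's ballot, and the column totals correspond to the "totals" line at the bottom of the table.

```python
# Hypothetical roster and ballots, for illustration only; the names echo
# the sample table, but the choices are invented.
ROSTER = ["Frank B.", "Karen C.", "Stanley D.", "John E.", "Mary G."]
BALLOTS = {
    "Frank B.":   ["Stanley D.", "John E.", "Karen C."],
    "Karen C.":   ["Mary G.", "Frank B.", "John E."],
    "Stanley D.": ["Frank B.", "John E.", "Mary G."],
    "John E.":    ["Frank B.", "Stanley D.", "Mary G."],
    "Mary G.":    ["Karen C.", "John E.", "Frank B."],
}


def tabulate(roster, ballots):
    """Return (matrix, totals); matrix[i][j] is 1 if pupil i chose pupil j."""
    position = {name: i for i, name in enumerate(roster)}
    matrix = [[0] * len(roster) for _ in roster]
    for chooser, chosen_names in ballots.items():
        for chosen in chosen_names:
            if chosen == chooser:
                raise ValueError(f"{chooser} may not appear on his own ballot")
            matrix[position[chooser]][position[chosen]] = 1
    # Column sums: the number of times each pupil was chosen.
    totals = {name: sum(row[position[name]] for row in matrix)
              for name in roster}
    return matrix, totals


def check_tabulation(matrix, ballots):
    """Re-check the table: the marks must equal the choices on the ballots."""
    marks = sum(sum(row) for row in matrix)
    expected = sum(len(set(chosen)) for chosen in ballots.values())
    return marks == expected


matrix, totals = tabulate(ROSTER, BALLOTS)
```

The final check mirrors the advice to verify each tabulation at least once: with five ballots of three choices each, the table must contain exactly fifteen marks.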


PRESENTATION OF RESULTS 


The easiest method of presenting simple tabulated data is to place 
the results on a table. However, the results of sociometric testing are 
usually reported in the form of a sociogram. Figure 5 is a theoretical 
sociogram illustrating some of the basic patterns found when socio- 
metric choices are made. 

Each circle with a number enclosed represents an individual. The 
lines tipped by an arrow represent a choice. A line tipped at both 
ends indicates that choices were mutual. The most selected individual 
(11) is called the "star." The individual who was neglected by the 
group (20) is referred to as the "isolate." 

FIGURE 5  A Sociogram 


Individuals (1), (2), (3), and (4) forming the square are called a 
"clique," though they may also represent a cleavage. A "triangle" and 
a "pair" are similar to the square except that three and two persons 
respectively are involved in mutual choices. A "chain" represents indi- 
viduals who are connected to each other like (17), (15), (12), (6), 
(8), (18), (11), (16), etc. 
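
The patterns just named can be read directly from the recorded choices before any drawing is done. The sketch below is illustrative only: the choices are invented and do not reproduce Figure 5, and the function names are hypothetical. It treats the "star" as the individual receiving the most choices, the "isolate" as one receiving none, and a mutual choice as a pair in which each chose the other.

```python
# Hypothetical choices (chooser -> individuals chosen); not the data of Figure 5.
CHOICES = {
    1: [2, 4], 2: [1, 3], 3: [2, 4], 4: [3, 1],   # a square of mutual choices
    11: [16], 16: [11, 15], 15: [11], 20: [11],
}


def choices_received(choices):
    """Count how many times each individual was chosen."""
    counts = {person: 0 for person in choices}
    for chosen in choices.values():
        for person in chosen:
            counts[person] = counts.get(person, 0) + 1
    return counts


def stars(choices):
    """Individuals who received the largest number of choices."""
    counts = choices_received(choices)
    top = max(counts.values())
    return sorted(p for p, n in counts.items() if n == top)


def isolates(choices):
    """Individuals chosen by no one."""
    counts = choices_received(choices)
    return sorted(p for p, n in counts.items() if n == 0)


def mutual_pairs(choices):
    """Pairs in which each individual chose the other."""
    pairs = set()
    for a, chosen in choices.items():
        for b in chosen:
            if a in choices.get(b, []):
                pairs.add(tuple(sorted((a, b))))
    return sorted(pairs)
```

With these invented choices, individual 11 is the star, individual 20 is the isolate, and the four mutual pairs among 1, 2, 3, and 4 form the square that the text calls a clique.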

In a situation where the above sociogram was obtained, the teacher 
could set up certain positive goals. If there were other evidence that 
the sociogram had validity (that (1), (2), (3), (4) were a clique, 
that (20) was an isolate, that (11), (16), and (13) were most popu- 
lar, etc.), the teacher might try to apply the results educationally by 
attempting to arrange projects so that the clique gradually falls apart, 
the isolate becomes acceptable to the others, and the class becomes 
more integrated. If this sociogram represented whom each one wanted 
at his table, the teacher might place one of the popular ones (11), 
(13), (16), (19), or (15) at each table to try to bridge the gap. 
These recommendations are based on the assumption that the cleav- 
ages and isolate phenomena in this class have superficial foundations. 
If the four persons in a cleavage were members of a minority group 
against which the class had deep-seated prejudices, it is unlikely that 
physical manipulation of chairs would remedy the situation. If any 
given individual in the class is under study, the sociogram might be 
constructed about that individual. Also, if there is a group of individ- 
uals under analysis, the group may be the focus of the sociogram. 
Whenever one draws up a sociogram, it may be necessary to rearrange 
circles a few times before achieving the best descriptive presentation. 

When integrating data, the teacher must be cautioned not to over- 
generalize on the basis of limited data. If the teacher asks the class to 
name the five persons with whom each student would like to work 
on a class art project, it does not follow that the results can be used 
to reveal friendship patterns. A good friend with no talent may be 
neglected in favor of a good artist who is personally not attractive to 
the student. There is a tendency for teachers to label certain students 
"isolates" because nobody chose them as one of their three best 
friends, etc. Much caution should be employed by the teacher in the 
interpretation of test results. If the teacher is interested in drawing 
certain generalizations about each child, it is well to incorporate into 
the original purposes and content of the test (or group of tests) those 
questions whose answers will permit such conclusions. 


Summary 


Sociometric methods are a means of determining the relationships 
existing among members of a group at a given time. Several methods 
have been found useful in determining such relationships: (1) an in- 
dividual may be asked to name a limited number of people from 
within the group with whom he would choose to associate on the basis 
of a stated criterion, (2) an individual may be asked to rate all the 
other members of his group, using some predetermined scale, or (3) 
an individual may be asked to select those members of his group who 
show defined characteristics. Sociometric techniques have been found 
to be of use in promoting the social adjustment of pupils, in promot- 
ing common interests and skills of a group, in grouping pupils for 
various class activities, in measuring growth in group status, and in 
detecting cleavages which reflect social standards among group 
members. 

While there are no formal rules for the construction of sociometric 
tests, the alert teacher will find numerous opportunities for the de- 
velopment of sociometric devices in the course of his normal class- 
room work. He must be careful not to violate the confidence of his 
pupils, and should be certain to utilize the choices made by his pu- 
pils in order to put their expressed desires into effect. Tabulation of 
the results of sociometric testing is relatively simple. The results are 
usually presented via a sociogram, which lends itself to the ready iden- 
tification of cliques, isolates, and "stars." 


Problems for Class Discussion 


1. How well do you know your own pupils? Make a list of the most accepted 
and most rejected pupils in your class (or in one you select). Construct 
a sociometric test to determine the relative popularity of the children in 
the group. 

2. This chapter has not indicated how the results of sociometric testing via 
the rating scale approach might be summarized. How would you tabu- 
late pupils' responses using this approach? How could you arrive at a 
measure of pupil status? 

3. Using a suitable question to secure choices from a class with which you 
have contact, construct a sociogram to indicate pupil interrelationships. 


References Cited in This Chapter 


1. Grapko, M. F., "A Study to Estimate the Degree of Relationship Between 
Certain Personality Traits and Social Status at a Boys' Summer Camp." 
Unpublished M.A. thesis, University of Toronto, 1946. 

2. Jennings, Helen H., Sociometry in Group Relations. Washington, D. C.: 
American Council on Education, 1948. 

3. Jourard, S. M., "The Relationship Between Outgoing Energy and Social 
Acceptance Among Children." Paper presented at meeting of the Ca- 
nadian Psychological Association, 1949. 

4. Justman, Joseph, and Wrightstone, J. Wayne, "A Comparison of Three 
Methods of Measuring Pupil Status in the Classroom," Educational and 
Psychological Measurement, 11:362-367, Autumn, 1951. 

5. Moreno, J. L., Who Shall Survive? A New Approach to the Problem of 
Human Inter-Relations. Washington, D. C.: Nervous and Mental Disease 
Publishing Company, 1934. 

6. Northway, Mary L., A Primer of Sociometry. Toronto: University of 
Toronto Press, 1952. 




References for Further Reading 


Northway, Mary L., A Primer of Sociometry. Toronto: University of Toronto 
Press, 1952. 
An excellent introduction to the entire field of sociometric techniques, 
written by one of the leaders in the field. 


Institute of School Experimentation, How to Construct a Sociogram. New 
York: Teachers College, Columbia Univ., 1947. 
A step-by-step analysis of the process of constructing a sociogram. Of 
considerable help to beginners. 


Cook, Lloyd A., "An Experimental Sociographic Study of a Stratified Tenth 
Grade Class," American Sociological Review, 10:250-261, April, 1945. 
An account of the attempt to change interpersonal relationships in a 
classroom. The article illustrates the way in which a sociogram helps 
to reveal the structure of the group. 


CHAPTER TWELVE | Case Studies 


The case study or case method involves the integra- 
tion and use of comprehensive data about an individual as the basis 
for diagnosing and interpreting conduct or behavior. The basic data 
of the case study are usually supplied from examinations of the indi- 
vidual by such specialists as the psychologist, social worker, physi- 
cian, and teacher. It is a method of appraisal that concerns itself with 
the careful examination of physical and psychological factors that are 
significant in the life of the person under study. Emphasis is placed 
on discovering what is unique in the relationships among factors in 
each case rather than what is characteristic of a large number of in- 
dividuals. The findings have significance for the treatment of malad- 
justments displayed by the individual. It is a diagnostic and remedial 
procedure based on a thorough investigation of a person in order to 
acquire knowledge of his history, his home conditions, and all of the 
things that may have contributed to his behavior difficulties. 


The Place of the Case Study in Evaluation 


The case study usually contains a description of the behavior in- 
dicative of maladjustment, physical and health conditions based on an 
examination, family background and history, early childhood history, 
mental capacity and educational achievements as revealed by tests 
and observational methods, the individual's personality, and efforts al- 
ready made to improve the adjustments. Thus, the case study is a 
primary tool of evaluation. It permits the synthesis of all pertinent 
facts about an individual within one framework. It emphasizes the 
uniqueness and individuality of the human personality. Pupil prob- 
lems which are not susceptible to diagnosis by group techniques may 
be approached through this method. The data gathered, according to 
Crow (1), should be restricted to those things which are considered 
relevant to the present difficulty of the student. 

The purpose of case study is two-fold: diagnosis and treatment (2). 
First, the underlying reasons for behavior or learning problems are 
discovered; second, a course of treatment or remedial work is under- 
taken. The case study method may also be used to gather facts and 
gain insight about an individual for the purpose of helping him to 
develop his best potentialities (5). The objective of the case study 
is not primarily that of research, and, although many case studies 
lead to a body of scientific generalizations about the meaning of cer- 
tain forms of behavior, the function of providing material for the ex- 
pansion of scientific knowledge may be considered a by-product. 
Rather, the individual is studied in order to help him overcome his 
learning problems or adjustment difficulties. The method further per- 
mits the evaluation of the treatment or remedial work undertaken 
after diagnosis. 


DEFINITION OF THE CASE STUDY TECHNIQUE 


The case study has been defined as “detailed study of an individ- 
ual, conducted for the purpose of bringing about better adjustment 
of the person who is the subject of the investigation" (7). It repre- 
sents an analysis of an individual's assets and liabilities by school, 
medical, and psychological personnel (2). It is desirable that a case 
study possess the following characteristics: completeness of data, con- 
tinuity, confidential recording, and scientific synthesis (3). Case stud- 
ies may be either cross-sectional or longitudinal: they may present a 
picture of an individual at any one time, or they may represent a con- 
tinuous study over weeks or months. Usually a picture of growth, 
continuity, and development is added to the cross-sectional study (6) 
in order to make possible comparisons between conflicting evidence 
and to clarify the picture of the individual as it begins to emerge. 
The teacher supplements the data gathered in the classroom about 
an individual's modes of behavior, interaction, and progress with in- 
formation from parents, guidance personnel, medical officers, and 
other teachers. Cumulative records may become a part of the informa- 
tion to be included in the case study (8). 

The definition of the case study clearly embraces the phase of 
treatment. Traxler (7) points out, however, that whether treatment is 
recorded as a part of the case study will depend on the type of prob- 
lem studied. Since the range of problems and difficulties accessible to 
study by this method is considerable, generalizations are difficult. 
Certain problems are referred to a specialist, such as a physician, 
psychologist, or social worker. Most other problems, including learn- 
ing difficulties, are treated within the school setting. Sometimes a par- 
ticular child is studied because the teacher feels that he has not utilized 
all of his potential abilities and talents. In those instances, the progress 
made after treatment or special work becomes a part of the case record. 

The case study emphasizes process rather than product (4). Stand- 
ardized tests will measure whether a student has retained a certain 
bit of information, but they will not provide clues as to why he was 
unable to comprehend or retain it. An accurate description of the na- 
ture of a child's difficulties in comprehending can result only from 
observation by the teacher, verbalization by the child, and many other 
types of information. The same, of course, holds true for problems of 
cooperation and adjustment. 


CHILDREN STUDIED BY THE CASE STUDY TECHNIQUE 


Children with serious difficulties in adjusting to home or school 
social situations, as well as children manifesting serious difficulties in 
scholarship, are usually studied by the case study technique. Shy chil- 
dren whose social and intellectual qualities seem capable of develop- 
ment are very fruitful subjects for study. In general, the decision is to 
study pupils who deviate from the average in any important respect. It 
may be argued that, if time and qualified staff permitted, all students 
should have records comparable to those of a case study. However, it 
is not necessary to apply this method to students whose behavior and 
other problems can be dealt with by the classroom teachers who un- 
derstand them without additional observation and study. 

Although the definition of the case study implies that it deals with 
one individual at one time, observation of groups can be used as part 
of the data for the case study. Traxler (7) lists several points regard- 
ing choice of subjects for the case study which are important for those 
attempting to make a case study for the first time: 


“Select a case in which you are really interested, both from the 
standpoint of the nature of the case and the personality of the in- 
dividual concerned. 

“If possible, choose a pupil from your own classes who, you feel, 
needs attention and help and who will probably cooperate well 
with you. 

“When considering various pupils, give some thought to the shy, 
quiet, retiring pupils. Pupils of this type are sometimes more suit- 
able subjects for case study than pupils whose difficulties or be- 
havior causes them to be noticed. 


“Plan only as much as you feel you can accomplish. If you contem- 
plate a thorough case study, including treatment, it will probably 
be best in the first year to confine yourself to one pupil." 


PROCEDURES IN A CASE STUDY TECHNIQUE 


Most outlines for case studies seek information of the following 
kind, to be gathered by the individual undertaking the case study with 
the cooperation of all those who have contact with the child studied 
(2, 5, 8): 

a. Reasons for studying the particular child. A description 
   of the special problem or an account of the incident which 
   initiated the case study would serve this purpose. 

b. Identification. Certain factual information must be pre- 
   sented regarding the child, such as name, address, sex, and 
   grade. Some teachers may feel that information about the 
   child's ethnic group and economic status is pertinent. 

c. Family background. Data regarding the home environment 
   of the child are necessary to provide a total picture. How 
   many persons are living in the home? What is their rela- 
   tionship to each other? What is their occupational and so- 
   cial status? How much privacy does the child have? What 
   are the parental methods of discipline? What are the par- 
   ents' attitudes toward the child? All these are questions 
   which need answers. 

d. Health record. This includes data on growth, weight, phys- 
   iological maturity, peculiarities of gait and speech, physi- 
   cal defects, nutritional status, and other evidences of the 
   child's physical well-being or deficiency. 

e. Educational data. In this category are included cumula- 
   tive record and attendance data, special academic profi- 
   ciencies and deficiencies, information on the extent of par- 
   ticipation in classroom and extra-curricular activities, and 
   data on the child's mental ability and achievement which 
   have been gathered from standardized and informal tests. 

f. Social behavior and interests. Observations by teachers, 
   counselors, and parents regarding the child's social behav- 
   ior and relations, and his interests and attitudes are in- 
   cluded. For instance, what are his work habits, his fears, 
   his likes, and dislikes? How much sex information does he 
   have and what are his sex habits? Is he able to get along 
   with others? Is his disposition cheerful or moody? Is there 
   evidence of emotional stress? Does he seem self-confident? 
   Long range observations in these areas give basic clues for 
   later evaluation and interpretation of the case material. 
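
Where case records are kept in a file rather than on paper, the outline above might be represented as a simple record structure. The following Python sketch is offered only as one possible arrangement: the class name `CaseStudyRecord`, its field names, and the `missing_sections` helper are all hypothetical, chosen to follow headings (a) through (f), and are not part of any published case-study form.

```python
from dataclasses import dataclass, field, fields


@dataclass
class CaseStudyRecord:
    """One pupil's case study, with a section for each heading (a) to (f)."""
    reasons_for_study: str = ""                                    # (a)
    identification: dict = field(default_factory=dict)             # (b)
    family_background: list = field(default_factory=list)          # (c)
    health_record: list = field(default_factory=list)              # (d)
    educational_data: list = field(default_factory=list)           # (e)
    social_behavior_and_interests: list = field(default_factory=list)  # (f)

    def missing_sections(self):
        """Names of the sections for which nothing has yet been recorded."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Illustrative entries only; the pupil and the notes are invented.
record = CaseStudyRecord(
    reasons_for_study="Referred for poor work in reading.",
    identification={"name": "Pupil X", "grade": 3},
)
record.health_record.append("Vision normal on school screening.")
```

A listing of the empty sections gives the person assembling the study a quick view of which contributors (parents, physician, other teachers) have still to be consulted.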




There is some disagreement about how intensive the case study 
should be. Symonds (6) believes that if there are specific symptoms, 
such as reading difficulty, the case study need not be exhaustive but 
may proceed at once to the particular functions involved. Other work- 
ers in this area (5, 8, 2) feel that even in a specialized case of this 
type, it is desirable to obtain a complete picture, maintaining that a 
particular difficulty can best be interpreted against the background of 
the whole personality of the child. The advocates of this view hold 
that too much cannot be learned about a child, his life, habits, and 
complete environment. 


SAMPLE OF CLINIC CASE STUDY 


The case study presented below as a sample was taken from the 
files of an educational clinic. Case material of school pupils taken, in- 
terpreted, and used for treatment within the school setting may differ 
from this in minor details. 


REPORT ON GEORGE C. 


Born: February 2, 1947     Age: 8 yrs. 3 mos.     Grade: 3 


Visited Clinic May 12th and 22nd, 1955 


George was referred to the Clinic for investigation of the reasons for 
his very poor academic work, especially in reading and arithmetic. 


General Impressions 


George was accompanied on his visit to the Clinic by his father. He had no 
idea why he had come, and was so nervous at first that he was visibly shak- 
ing. After explanation of the reason for the visit and a short period of 
getting acquainted, he was more at ease. However, he never seemed com- 
pletely comfortable. 


He showed a marked desire to do well on the various tests, showing good 
effort and attention and persevering in tasks even when he found them 
very difficult. He was a little disorganized by tests in which speed was 
important. At times he showed good planning, and at other times he resorted 
to an inferior trial and error procedure. 

He spoke in a thin, high-pitched voice, with a definite lisp. He answered 
questions readily and with apparent frankness, but volunteered few spon- 
taneous comments. In general, he seemed somewhat immature socially for 
his age. 


Physical Findings 


Medical examination indicates that George is a very well developed boy in 
good health. He is seven inches above average for his age in height and 
weighs 16 pounds above the average for his height and age; he is well 
nourished without being overweight. Vision is normal. Hearing as measured 
by the Maico D-5 Audiometer is borderline, with a loss of 16.8% in the 
right ear and a loss of 20% in the left ear. He has a short frenum, which 
probably is the cause of his lisp. No other defects were found. 


In tests of eye dominance, he shows no preference for either eye. He writes, 
draws, and does most other things with his left hand. In working with con- 
crete materials, he uses both hands about equally, and in simultaneous writ- 
ing he reverses as much with one hand as with the other. The left hand 
dominance is incomplete and rather weak. 


Abilities 


Because of a foreign language handicap, George's present mental age on the 
Revised Stanford-Binet Intelligence Scale, Form L, of 7 years 10 months 
and his IQ of 95, are considered to represent a minimum rather than a true 
measure. On the scale he passes all items at the seven year level; of the five 
items passed above this level, three are tests of auditory memory. 


His vocabulary is limited, failing at the 8 year level. On a series of non- 
language tests involving perception of form and physical relationships his 
work ranges from very superior to very inferior, with a median score of 
8 to 9 year level. His drawing of a man, scored for ideational content, is 
above average for his age. There is on record an IQ of 99 obtained on the 
Pintner-Cunningham Test in June 1953. The indications are that George 
is a boy of at least average intelligence, whose scores on verbal tests may 
improve as he learns more English. His present mental ability on predomi- 
nantly verbal tests is equal to that of an average child in the third grade. 


George was given several tests from the Stanford Achievement Test, Primary 
Battery, Form D. His grade scores were as follows: Paragraph Meaning, 3.0; 
Word Meaning, 2.5; Spelling, 3.8; Arithmetic Computation, 3.8. The spell- 
ing and arithmetic scores are up to his present grade, the second half of 
the third, and seem to indicate average performance in these subjects. 


His problem in reading is mainly one of lack of knowledge of word mean- 
ings. His oral reading is quite accurate and shows fairly good phrasing and 
very good ability at working out the pronunciations of unknown words. He 
can pronounce many words whose meaning he does not know. His retarda- 
tion in reading seems to be a mild one and one which should clear up as 
he acquires a richer English vocabulary. In silent reading, he seems to need 
more complete explanation of directions than most children do. His com- 
prehension in silent reading is at as high a level as his verbal comprehension 
on the intelligence tests. 


George comes from a home in which Spanish is spoken. His mother speaks 
very little English; his father's English is fairly fluent, but spoken with a 
marked accent and faulty grammar. George is said to have known very 
little English when he entered the first grade. His vocabulary weakness 
seems to stem from this. 


School Adjustment 


Mr. C. says that he was not aware that George's work was deficient until 
last term, when he was called to school to discuss it. He then asked that 
George be allowed to take his books home, and since then has been super- 
vising his son's homework. George's report card for last term shows marked 
improvement during the term in reading, spelling, and arithmetic. It seems 
as if his scholastic difficulties were to a large extent overcome before he 
came to the Clinic. 


Social and Emotional Adjustment 


George's father was well dressed and gave the impression of having better 
than average intelligence. He is unable to work because of chronic illness 
and therefore has much more time to spend with his son than most fathers 
do. It is quite obvious that Mr. C. is very fond of this son. He admits no 
faults in the boy, denies that George has any social difficulties, and asserts 
that the scholastic problem no longer exists since he started to pay attention 
to George's homework. 


George gives, as mentioned earlier in this report, an impression of imma- 
turity in his speech and general manner. He plays with a group of boys and 
girls ranging in age from 6 to 8. His wishes and interests are quite normal. 
He is afraid of cars and of big boys, who throw stones when the little boys 
make faces. He seems to have a marked dislike for his sister, who is a year 
older; he thinks she is too bossy and takes his things. He likes his present 
teacher and thinks that he is getting along well now. Except for his hostility 
toward his sister, nothing significant of maladjustment came out of the in- 
terview. His drawing of a man is more like a typical girl's drawing than 
like a boy's; the head is very large and drawn with considerable attention 
to detail, while the arms and legs are small and poorly drawn. This is addi- 
tional evidence of emotional immaturity. 


The fact that for several terms George has been marked unsatisfactory on 
the traits "Works and Plays with Others" and "Respects the Rights of 
Others" indicates that difficulties exist which the father denies. It is probable 
that George is accustomed to being a little king at home; he resents having 
to share things with his sister, and carries his egocentric habits over into 
the school situation. 


Interpretation 


George is a boy of average or better than average intelligence whose prog- 
ress in school has been retarded somewhat by a Spanish-speaking home. His 
scholastic adjustment has improved markedly since the request for ex- 
amination was made. His incomplete left dominance does not seem to have 
anything to do with his difficulties in school. The personality difficulties 
noted in school probably result from spoiling at home by his father. Insuffi- 
cient evidence about this was obtained to allow more definite conclusions. 
No evidence of severe emotional disturbances was found. 




Recommendations 


1. George's scholastic difficulties seem to have been already taken care of in 
large measure. If he should fall behind again, it would be worthwhile to 
discuss his difficulties with Mr. C. who is very anxious to do anything 
that will help George. 

2. In talking with Mr. C., a few suggestions were made: 


a. that so far as possible he should expect George to do his homework by 
himself, and not do it for him; 


b. that in other ways he should give George opportunities to become 
more self-reliant and less dependent upon his parents; 


c. that he should try to find out what truth there might be in George's 
complaints about his sister; 

d. that clipping of the frenum, easily done in a hospital clinic, might 
improve the clarity of George's speech. 

3. In regard to George's unsatisfactory personality traits as evidenced in 
school, not enough was found out to warrant specific recommendations. 
The general suggestion that he should be commended when he behaves 
in a more approved fashion may, perhaps, not be out of place. It seems 
probable that there should be improvement in conduct as his scholarship 
improves. 
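

The age and IQ figures in the sample report can be checked with a little arithmetic. The sketch below assumes that the clinic computed a simple ratio IQ, that is, mental age divided by chronological age and multiplied by 100; the report does not state its method, and published scoring tables may differ slightly from the plain ratio. The function names are hypothetical.

```python
from datetime import date


def age_in_months(born, seen):
    """Completed months of age on the date the child was seen."""
    months = (seen.year - born.year) * 12 + (seen.month - born.month)
    if seen.day < born.day:
        months -= 1  # the current month is not yet completed
    return months


def ratio_iq(mental_age_months, chronological_age_months):
    """Ratio IQ: 100 x mental age / chronological age, rounded."""
    return round(100 * mental_age_months / chronological_age_months)


# Figures taken from the report on George C.
chronological = age_in_months(date(1947, 2, 2), date(1955, 5, 12))
mental = 7 * 12 + 10          # mental age of 7 years 10 months
iq = ratio_iq(mental, chronological)
```

On these figures the chronological age is 99 months (8 years 3 months) and the mental age 94 months, and the ratio comes to 95, which agrees with the IQ stated in the report.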


INTERPRETING THE DATA IN A CASE STUDY 


After the data have been collected, it is advisable to present the 
material in an objective, simple form, with little bias (8). Description 
should be kept separate from interpretation. The material may be 
confined to a few pages. The final interpretation of the data and the 
decision as to the kind of remedial effort or treatment indicated is the 
joint concern of all those most closely in contact with the child. They 
enlist the guidance of an examining psychologist or counselor, who 
helps in the analysis of the findings and assists in the formulation of 
recommendations. The meeting of the individuals involved in the case 
study is called the case conference. Some of the following persons may 
be present at a case conference: the principal, the teacher or teachers 
of the student being studied, the counselor or guidance person, the 
school physician or school nurse (1). Social case workers, probation 
officers, and welfare officers are included if the problem of the student 
warrants their presence. 

Case conferences represent learning experiences for teachers, and 
are valuable for an understanding of all the children in the school, 
though they may be called for the purpose of considering specific be- 
havior problems. The case conference sensitizes teachers to the inter- 
relationship between physical, intellectual, and emotional factors. 




As the recommendations made in the case conference are carried out, 
additional observations are necessary to evaluate whether the con- 
templated adjustments or changes have indeed taken place. Some- 
times the changes can be measured by standardized tests and 
measurements. At other times, several months of additional observa- 
tions are necessary to verify whether the interpretation given was 
correct and the remedial efforts deriving from the interpretation have 
led to improvement or adjustment. 


Summary 


The case study represents an analysis and synthesis of an individ- 
ual’s physical and psychological assets and liabilities by school, medi- 
cal, and psychological personnel. The purpose of the case study is to 
achieve better adjustment of the individual by means of a compre- 
hensive appraisal of all factors contributing to the problem under 
study and to institute remedial activities. Children with serious diffi- 
culties in adjusting to home or school situations are usually studied by 
means of this technique. 

The case study contains a description of the behavior indicative of 
maladjustment and data from observation and examinations of the 
individual's health, family background, mental capacity, educational 
achievements, and personal or social characteristics. The interpreta- 
tion of the data and recommendations for remedial action are the 
joint responsibility of all those who have studied and who have worked 
with the child. Subsequent observations are made to verify whether 
the interpretation of data was correct and the remedial efforts have 
resulted in improvement. 


Problems for Class Discussion 


1. Prepare a short and simplified case study of a pupil with problems, in- 
cluding as much available data as you can obtain for: (a) reasons for 
study, (b) identification data, (c) family background, (d) health record, 
(e) educational data, (f) social behavior, and (g) interests. 

2. Prepare a case study of a pupil with a problem of retardation in reading. 
On the basis of this case study make (a) a diagnosis of causative factors, 
(b) a plan for working with the pupil. 


References Cited in This Chapter 


1. Crow, L. D., and Crow, A., Mental Hygiene in Schools and Home Life. 
New York: McGraw-Hill Book Co., 1942. 

2. Division of Research and Guidance, Los Angeles County, Guidance 
Handbook for Secondary Schools. Los Angeles: California Test Bureau, 
1948. 

3. Good, C. V., Barr, A. S., and Scates, D. E., The Methodology of Educa- 
tional Research. New York: Appleton Century Co., 1936. 

4. Olson, W. C., "The Case Method," Review of Educational Research, 
9:488-490, December, 1939. 

5. Strang, R., Counseling Techniques in College and Secondary School. 
Chapter 30. New York: Harper & Brothers, 1937. 

6. Symonds, P. M., Diagnosing Personality and Conduct. New York: Cen- 
tury Co., 1931. 

7. Traxler, A. E., "Case Study Procedures in Guidance," School Review, 
46:602-610, 1938. 

8. Traxler, A. E., Techniques of Guidance, p. 284-307. New York: Harper 
& Brothers, 1945. 


References for Further Reading 


Strang, R., Counseling Techniques in College and Secondary School. Chap- 
ter 30. New York: Harper & Brothers, 1937. 


In Chapter 30 of this volume, the author illustrates how the case study 
may be applied in the counseling of students. 


Traxler, A. E., Techniques of Guidance, p. 284-307. New York: Harper &
Brothers, 1945.


This reference provides a comprehensive statement of the definition,
procedures, and uses of the case study. It is especially valuable for classroom
teachers, as well as guidance counselors.


CHAPTER THIRTEEN | Cumulative Records


The cumulative record of the individual pupil is usu- 
ally regarded as a permanent and official summary of his educational 
history. This record may be in the form of a card, folder, or booklet. 
The cumulative record may be used as a technique of appraisal and 
guidance for the following purposes: (a) to enable teachers to get 
acquainted quickly with new pupils, (b) to identify scholastic 
strengths and weaknesses of individual pupils and to plan a program 
to fit the pupil, (c) to identify problems of personal or social adjustment
that a pupil faces, (d) to provide comprehensive and continuous
data for counseling with the pupil, and (e) to serve as a source of
information in a conference with a parent. The record also serves as
the basis of reports to other schools, colleges, or prospective employers.
In the modern school the pupil's progress and growth are summarized
on the school's cumulative record. The entries on such a cumulative
record are related as closely as possible to the major objectives of the
school. As indicated earlier in this book, these objectives involve not
only academic achievement and progress, but also physical health,
social adjustment, emotional stability, interests, aptitudes, and attitudes.
The cumulative record serves as a comprehensive measure of the
effectiveness of the school's service in guiding the growth of the pupil.
The periodic, concise recording of all aspects of a pupil's career in
and out of school challenges the school to consider the individuality of
its pupils. It further makes possible a progressive evaluation of the
school's services to each individual.


Place of the Cumulative Record in Evaluation


A careful study of cumulative records reveals many of the factors
which have facilitated or retarded the growth and development of
individual pupils. The data obtained from the records may be used
in many ways. They can result in greatly improving guidance of the
pupil's program of activities and studies. They can indicate to the
teacher in a cumulative fashion the persistent strengths and weak- 
nesses of the individual child. They can give a history of the pupil's 
progress from kindergarten through high school. 

In education today it is felt that the school program should be 
adapted to the abilities, interests, and needs of the child. It is, there- 
fore, essential that a systematic and comprehensive record be used to 
give the teacher as complete an understanding of each child as pos- 
sible, his past experiences, his home and environmental interrelations, 
and his mental, emotional, social, and physical development. It is thus
clear that cumulative records are indispensable as a long range method
of evaluating pupil growth.


DEFINITION OF THE CUMULATIVE RECORD 


The cumulative record is a method for recording, filing, and using 
information essential to the guidance of students (3). It represents a 
cumulative individual inventory of the educational, mental, physical, 
social, and emotional development of each pupil. The record should be 
as complete as possible and truly cumulative. Information in these 
areas is generally recorded on the cumulative record: 


1. identification of the student.
2. family and cultural background.
3. physical and medical history.
4. marks in school subjects.
5. mental and achievement test data.
6. extracurricular activities.
7. interests.
8. special talents.


It is advisable to start the cumulative record at the time of school 
entrance. Of course, not all kinds of information can immediately be 
obtained or recorded. The development of the inventory is a gradual 
process. It has been suggested that each proposed entry should be 
subjected to the question: “What contribution will this item make 
toward the diagnosis of the child’s interests, capacities, aptitudes, limi- 
tations, and vocational possibilities?” (4) The judgment of the teacher 
will separate the wheat from the chaff, will select pertinent and sig- 
nificant data, and weed out the unimportant and irrelevant. It is ad- 
visable to keep interpretation and opinion separate from objective 
description in entering data on the cumulative record. 
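
The definition above can be made concrete with a small data structure. What follows is only a minimal sketch, not a form used by any school system named in this chapter: the names `AREAS`, `Entry`, and `CumulativeRecord` are hypothetical, and the only ideas taken from the text are the eight areas of information, the cumulative (dated) character of the entries, and the advice to keep interpretation separate from objective description.

```python
from dataclasses import dataclass, field
from datetime import date

# The eight areas listed in the text; the identifiers are illustrative.
AREAS = (
    "identification",
    "family_and_cultural_background",
    "physical_and_medical_history",
    "marks_in_school_subjects",
    "mental_and_achievement_test_data",
    "extracurricular_activities",
    "interests",
    "special_talents",
)


@dataclass
class Entry:
    recorded_on: date
    area: str
    description: str          # objective description of what was observed
    interpretation: str = ""  # opinion, stored apart from the description

    def __post_init__(self):
        if self.area not in AREAS:
            raise ValueError(f"unknown area: {self.area}")


@dataclass
class CumulativeRecord:
    pupil_name: str
    entries: list = field(default_factory=list)

    def add(self, entry):
        """Append one dated entry; nothing is overwritten."""
        self.entries.append(entry)

    def history(self, area):
        """Return the entries for one area, oldest first."""
        selected = [e for e in self.entries if e.area == area]
        return sorted(selected, key=lambda e: e.recorded_on)
```

Because entries are only appended and are read back in date order, the record stays cumulative in the sense described: a teacher reading `history("interests")` sees how the pupil's interests were recorded over the years, with each opinion clearly marked as such.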




TYPES OF CUMULATIVE RECORDS 


Cumulative records may be kept in the forms of cards, booklets, or 
folders. There has been a tendency to move toward the folder type of 
record, which seems the most comprehensive and flexible of the three. 
Regularly recorded data may be entered on forms printed on the 
folder proper. Special records, such as health charts, anecdotes, and 
confidential information, may be placed on separate forms and inserted 
in the folder. 


Card The cumulative record developed by the Educational Records 
Bureau, for example, is a card nine by eleven and a half inches (5). 
It is used with a supplementary card of the same size, which gives a 
cumulative graphic record of test percentiles and school marks. Many 
features of the card devised by the Educational Records Bureau have 
their origin in the American Council on Education form for secondary 
schools.

The cumulative record card of the Educational Records Bureau
begins with a number of items concerning the personal background
of the pupil, such as nationality and family background data, including
the occupation of the father or mother. This is followed by a
record of the pupil's participation in eight activities, with spaces left
for adding other activities. Some of the activities listed are: use of
leisure time, music, athletics, summer activities, and travel. Below this
there is a record of general health, listing height, weight, vision, hearing,
and physical limitations. There is a space for recording home influences.
Another section of the record deals with personality, which
is classified in terms of traits such as reliability, industry, initiative,
emotional adjustment, and social adjustment. On the side of the card
ample space is provided for each teacher to make comments appropriate
to the items listed under activities and personality. At the bottom
of the face side of the card there is space in which comments may be
summarized under the heading “Comments at Time of Transfer.”

The back of the card is reserved for the record of scholarship and
achievement. Half of the card is taken up by a table for subjects,
marks, and credits. Under each year there are columns for three marking
periods, a mark for the year, rank in class, and periods per week
that the subject was studied. The other half of the reverse side of the
card lists scores on standard achievement tests taken by the pupil.
The supplementary graphic record follows a pattern set by the graph
of the American Council Cumulative Record Folder.



Booklet The booklet type of record is illustrated by the Cooperative 
Educational Record of the Denver Public Schools (2). This record is 
kept cooperatively by teachers, parents, and pupils. The booklet for 
the Denver secondary schools is eight and a half by eleven inches in 
size, consists of forty-seven pages, and is in part devoted to explaining 
the purpose of the cooperative educational record and giving directions
to pupil, teacher, and parent in making entries. The major part
of the data is recorded by the student. There are seven sections in 
which information is noted. Section One deals with informal educa- 
tional experiences and interests, and permits recording of experiences 
under such headings as participation in clubs and other organizations, 
offices and positions of leadership, and vocational experiences. Sec- 
tion Two gives the record of formal educational experiences and 
interests, and lists subjects, grades, hours, and marks. Section Three 
contains the record of standardized tests. Section Four provides data 
on patterns of behavior and is in the nature of a self-descriptive inven- 
tory. The pupil rates himself on judgment, self-direction, organization, 
resourcefulness, leadership, acceptance of facts, cooperation, and 
similar characteristics. Instructions for evaluating are given to the 
pupil in the booklet. Section Five gives personal and home informa- 
tion. Section Six consists of the pupil's reading record. Section Seven, 
for informal observations and information, provides ample space for 
comments by parents, teachers, and the pupil himself. 


Folder The California Cumulative Guidance Record, which may be 
cited as an example, is designed to meet a variety of local school 
needs. It consists of one Basic Record Folder and three or more sup- 
plements (1). Two of the supplements make possible a more detailed 
presentation of data recorded on the basic record folder. Additional 
supplements devised by local school systems can be inserted into the 
folder. The basic record folder is approximately nine by twelve inches 
in size and consists of four divisions: identification, home environment, 
personality development, and school experiences. If no basic supple- 
ments are used, data on health, physical, social and emotional de- 
velopment are to be recorded on the basic record folder. Basic 
Supplement One provides opportunity for elaboration of data on 
health and physical development, summarizing the results of health 
examinations, immunizations, and the like. Basic Supplement Two 
deals with curricular experiences and permits recording of units of 
study which gave the pupil significant experiences. Basic Supplement 
Three, for adjustment factors, lists items such as study habits, educa- 
tional plans, educational programs, and vocational interests. 




INTERPRETING DATA ON THE CUMULATIVE RECORD 


Each entry on the cumulative record makes a contribution toward 
the diagnosis of a pupil's interests, capacities, aptitudes, limitations, 
and vocational possibilities. Data in each of the following areas make
a specific contribution in assisting in interpretation.


Family and Cultural Background 


Information recorded under this heading is varied, ranging from
factual data concerning such items as nationality and occupation of
parents to information about the kind of emotional relationships existing
in the home. Not all of the information listed here may greatly
assist the teacher or guidance worker in understanding the child. Data
on economic status, for instance, may not be significant by itself. A
child from either an impoverished or a luxurious home may enjoy a
close and warm relationship with parents and siblings, making him
an emotionally secure individual. It is the quality of the interpersonal
relationships in the home that it is important to know for an understanding
of a pupil's emotional retardation or progress. Other data,
such as that on ethnic origin, may help to gauge the child's needs in
certain respects.


Physical and Medical History 


The results of health examinations and tests and the report on the
physical condition of the child are of great importance. Frequently,
many of the problems noted in other parts of the cumulative record
may be traced to poor health. Social and emotional behavior, as well
as scholastic achievement and progress, are directly related to the
physical functioning of the child. Physical limitations must also be
taken into account in vocational guidance.


Marks in School Subjects 


School marks are more than ratings of achievement. They are, in
addition, a reflection of vivaciousness, cooperativeness, and talkativeness
(4), and contain elements related to later success. It has been
pointed out, however, that marks provide a limited understanding of
capacities and aptitudes. They represent judgments reflecting varying
standards. Poor marks, for instance, may indicate lack of motivation,
not lack of scholastic ability. Marks are most valuable in interpretation
when considered in conjunction with other items on the cumulative
record.




Mental and Achievement Test Data 


Mental tests are an index of the potentiality of a pupil, whereas 
achievement tests measure what an individual has achieved in a sub- 
ject and what he may continue to achieve in that subject in the future. 
Data on mental and achievement test scores may be considered to- 
gether. A pupil who does well on the mental tests but consistently 
does poorly on achievement tests needs guidance and individual at- 
tention which will search for the reasons for the discrepancy between 
potentiality and performance. 
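
The comparison described above, a pupil who does well on the mental tests but consistently does poorly on achievement tests, can be written as a simple screening rule. This is a hedged sketch only: the function name `needs_followup` and the percentile cut-offs of 75 and 40 are hypothetical and do not come from the text, which gives no numerical standard. The rule does no more than point out a record worth studying; the search for reasons remains a matter for guidance and individual attention.

```python
def needs_followup(mental_percentile, achievement_percentiles,
                   high=75, low=40):
    """Flag a pupil who does well on mental tests but consistently
    does poorly on achievement tests.

    `high` and `low` are hypothetical percentile-rank cut-offs.
    """
    if not achievement_percentiles:
        return False  # no achievement results to compare against yet
    does_well = mental_percentile >= high
    # "Consistently" is read here as: every recorded result is low.
    consistently_poor = all(p <= low for p in achievement_percentiles)
    return does_well and consistently_poor
```

Requiring every achievement result to be low is one reading of "consistently"; a school might reasonably prefer a different rule, such as a majority of results over several years.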


Extracurricular Activities 


The kinds of extracurricular activities in which a pupil spontane- 
ously engages are an important indicator of real interests and abilities, 
since such activities, being undertaken without pressure, are very 
meaningful to the pupil. Data about the nature, variety, and duration 
of extracurricular participation often give clues as to future occupa- 
tional interests. Information in this area can also be utilized in pro- 
viding added curricular experiences. 


Interests 


Interests emerge as pupils become exposed to many and varied 
experiences. As pupils are acquainted with art, with industry, with 
communal and civic life, with various occupations and hobbies, their 
interests become gradually defined. Data on the cumulative record 
will reveal this gradual focusing of interests, as out of many experi- 
ences a few are chosen for repetition. This information is valuable for 
guidance, becoming more important as the student goes through 
school. It must, however, be kept in mind that some interests reflect 
both inner and environmental circumstances and it is quite possible 
that they may suddenly shift. 


Special Talents 


Data about special talents are closely related to data about interests. 
It is important to note all evidences of special talents and abilities as 
well as special handicaps and limitations. These are revealed through 
observation of hobbies, interests, work records, and anecdotal material 
(4). Tests of special aptitudes are of great value. Any evidence of 
special talent in music or the fine arts, as well as evidence of special 
mechanical and manual aptitude, is important both for guidance and 
for planning activities for the individual pupil. 




ILLUSTRATION OF A CUMULATIVE RECORD 


On the adjoining pages, illustrations of sections of the cumulative 
record used in the New York City Schools are presented. The New 
York City cumulative record is of the folder type. The sections on the 
main, or basic, folder for personal and educational data include the 
following: (a) personal information, (b) registration and attendance 
data, (c) subject growth record, (d) personality scale, and (e) guid- 
ance information. Separate inserts, or supplements, include the pupil 
health record form and the test record for results of individual and 
group tests.

In the section on personal information, all identification data about 
date of birth, parents' names, family data, language in the home, and 
home address are found. This section also includes registration and 
attendance data from kindergarten through grade six. 

The subject growth record section of the cumulative record shows 
teacher rating of achievement in such subjects as reading, language 
arts, social studies, science, mathematics, health, music, and the fine 
arts. The ratings used are as follows: S indicates satisfactory achieve- 
ment in terms of capacity; I indicates need for improvement in terms 
of capacity; and U indicates unsatisfactory achievement in terms of
capacity. 

The personality scale and the guidance data are shown in other sec- 
tions of the cumulative record. 

The insert for individual and group test results includes a record 
of results of intelligence and achievement tests. 


THE UNIQUE ROLE OF THE CUMULATIVE RECORD IN EVALUATION 


As a long range method of evaluating pupil growth and development
the cumulative record plays a unique role. The study, observation,
evaluation, and testing of one individual over a period of time
resembles the genetic approach in the fields of biology and clinical
psychology. Ruch and Segel (4) have formulated two generalizations
which state succinctly the advantages of the cumulative record. One is
that "the predictive value of an item is increased when data on that
item are gathered annually or periodically over a range of years."
This means that if a certain comment, rating, or score shows up on
the record time after time we can place more faith in it as reflecting
the personality or aptitude of a pupil than we can if there is only one
entry.


FIGURE 6 Personal Information and Subject Growth Sections of a Cumulative Record

FIGURE 7 Personality Scale and Guidance Data Sections of a Cumulative Record

FIGURE 8 Individual and Group Test Results of a Cumulative Record


Their second generalization is equally valuable: “A number of related
items usually give a better prediction than any one single item,
if properly weighted and combined.” They illustrate this point by
stating that school marks, teacher ratings, and intelligence and achievement
tests give a better picture of all-round academic ability than
any one item would. This is particularly true if observations and tests
tap different aspects of the pupil's interests and personality at different
times.
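
The two generalizations can be illustrated with elementary statistics. The sketch below is an illustration under stated assumptions, not the procedure of Ruch and Segel: it assumes that repeated entries carry independent errors of equal size, and that related items have already been expressed on a common scale before they are combined. The function names and the weights in the usage note are hypothetical.

```python
import math


def standard_error_of_mean(sd, n):
    """Standard error of the average of n independent entries,
    each with the same error standard deviation sd."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return sd / math.sqrt(n)


def weighted_composite(scores, weights):
    """Weighted average of related items on a common scale.

    scores and weights are dicts keyed by item name.
    """
    total_weight = sum(weights[name] for name in scores)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    weighted_sum = sum(scores[name] * weights[name] for name in scores)
    return weighted_sum / total_weight
```

Under these assumptions, an error of 10 points in a single entry shrinks to 5 points in the average of four yearly entries, which is one way of seeing why a rating that recurs year after year deserves more faith than a single entry. A composite such as `weighted_composite({"marks": 60, "teacher_rating": 70, "achievement_test": 80}, {"marks": 1, "teacher_rating": 1, "achievement_test": 2})` combines related items with the achievement test counted double; the choice of weights is a judgment the text leaves to the user.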

A comprehensive cumulative record, well kept, is an important tool 
of an on-going process of evaluation, and it permits a continuous 
feed-back of information to teachers, guidance workers, parents and 
—last but not least—the pupil. 


Summary 


The cumulative record in the form of a card, folder, or booklet is a
permanent educational history of the individual pupil from his entry
into a school system until he leaves. The cumulative record contains
fairly complete information about identification data for the pupil,
his parents, and siblings; his school achievement, test results, course
of study, attendance, health, personality, and similar pertinent data.

The cumulative record serves a variety of administrative, supervis- 
ory, and instructional uses. It may be used as a technique of appraisal 
or guidance in counseling with the pupil, conferring with his parents, 
and transmitting reports to other schools or prospective employers. 

The unique role of the cumulative record in evaluation is that the
predictive value of data is increased when such data have been recorded
periodically. The review of a series of records on school
achievement, test results, or personality ratings is more valid and
reliable than any one single datum on a given pupil characteristic. In
addition, the relationships among such items as intelligence, achievement,
and personality characteristics may be observed and woven into
a comprehensive picture of the individual as he has grown and developed.
The cumulative record provides a continuous and coherent
evaluation of the individual pupil's physical, mental, emotional, and
social development.


Problems for Class Discussion 

1. Make a survey among five to ten teachers to determine the uses they
make of the cumulative record. 

2. Study the cumulative record of a pupil. From this study, prepare a summary
statement of the scholastic and personal or social strengths and
weaknesses of the pupil.

3. Obtain cumulative records used in several school systems. Assess the
values and limitations of each record form. Which would you prefer to 
use? 


References Cited in This Chapter 


1. California School Supervisors’ Association, California Cumulative Guid- 
ance Record for Elementary Schools. San Francisco: A. Carlisle and Com- 
pany, 1944. 


2. Denver Public Schools, Cooperative Educational Record. Denver: Board
of Education, 1935. 

3. Division of Research and Guidance, Los Angeles County, Guidance Hand- 
book for Secondary Schools. Los Angeles: California Test Bureau, 1948. 

4. Ruch, G. M., and Segel, D., Minimum Essentials of the Individual In- 
ventory in Guidance. Washington, D. C.: United States Department of 
the Interior, Office of Education, 1939. 

5. Traxler, A. E., “A Cumulative Record Form for the Elementary School," 
Elementary School Journal, 40:45-54, September, 1939. 


References for Further Reading 


Segel, David, Nature and Use of the Cumulative Record. United States 
Office of Education, Bulletin No. 3. Washington, D. C.: Superintendent 
of Documents, 1938. 


An old but comprehensive and concise discussion on the nature and 
use of the cumulative record. 


Traxler, A. E., How to Use Cumulative Records. Chicago: Science Research 
Associates, 1947. 


This reference is especially valuable for the teacher, counselor, or 
supervisor because it tells simply and graphically the many uses of cumu- 
lative records. 


PART THREE | Evaluating Major Objectives and Situations


CHAPTER FOURTEEN | Evaluating Achievement in Language Arts and Mathematics


Language arts and mathematics comprise the time- 
honored “3 R's"—reading, writing, and arithmetic—used as basic sym- 
bols and tools of communication in American culture. Verbal and 
mathematical symbols and processes are used whether an individual 
participates in a game of dominoes or in a complex industrial or busi- 
ness transaction. Since these symbols are so pervasive and basic to 
communication and learning, the roles of language arts and mathe- 
matics have loomed large in the curriculum of the common school. 
Modern educators recognize, however, that acquisition of skills and 
knowledge in these subjects should be supplemented by realization 
of such other objectives and concomitant learnings as power in reasoning,
interests, attitudes, and appreciations.


ACHIEVEMENT TESTS IN BASIC SKILLS 

The earliest objective tests constructed in 1910 and shortly there- 
after by Thorndike, Courtis, Monroe, and others emphasized achieve- 
ment in arithmetic and in the language arts of reading, writing, spell- 
ing, grammar, and usage. Since those early days, many technical re- 
finements have been made to improve the validity, reliability, norms, 
and interpretation of achievement tests. Scientific studies of curriculum
content are now made so that test exercises provide a representative
sample of the important learnings. Each item is checked for statistical
as well as curricular validity. Reliability of tests has been improved
by new methods of analysis. Norms are now obtained by improved
methods of selecting representative samples of pupils. Methods
of interpretation have been devised for diagnosis of test results
and comparison with similar age or grade groups.



Curricular Validity Curricular, or content, validity is of central im- 
portance in achievement tests. An achievement test in reading, spell- 
ing, or arithmetic has curricular validity if it measures fairly the ex- 
tent to which pupils have learned the curricular experiences provided 
in the school. In the language arts and mathematics courses of study 
and in various related series of textbooks, the curriculum content is 
fairly uniform throughout the nation so far as scope and nature of 
major objectives of instruction and grade placement of subject matter 
are concerned. To avoid invalidity attributable to improper sampling
of curriculum content, national distributors of standardized tests make 
careful studies of representative courses of study and instructional 
materials used in the schools of the nation. A sampling plan is used 
to select test items that are equally significant for appraising all major 
abilities, knowledges, and skills in a course of study. 

Lindquist (7:121) discusses another trend in achievement testing in 
which he expresses the viewpoint that tests of immediate and specific 
objectives of instruction should be constructed locally to fit the local 
course of study. Tests intended for wide-scale use should emphasize 
comprehensive, general, and ultimate educational objectives in a cur- 
ricular area. These may appropriately be called tests of general edu- 
cational development, with exercises designed to measure growth in 
general abilities and skills rather than specific information and knowledge.
Such tests permit variations in the specific content or experiences
provided in a local school situation so long as these contribute
to such general or ultimate objectives as comprehension in reading,
facility in written expression, and application of mathematical processes
and concepts.


STANDARDIZED AND INFORMAL TESTS OF ACHIEVEMENT 


For convenience in discussion, the methods of testing achievement 
may be classified as (a) standard or formal tests and (b) informal or 
teacher-made tests. The standard tests for wide-scale use are valuable 
for measuring growth and development over long periods of time and 
for providing comparative data based upon such relatively uniform 
standard units of measurement as age, grade, percentile, and standard 
score norms. Such standard tests, however, constitute only a part of 
the total evaluation program for achievement. The remaining part involves
the day-by-day evaluation, using informal or teacher-made
tests and techniques. These may include teacher-made tests using true-false,
completion, or multiple-choice responses, and, on appropriate
occasions, such methods as oral questions, anecdotal records, interviews,
or observations of pupils. Many suggestions for informal tests
and other methods of appraisal are provided in the Forty-Fifth Yearbook
of the National Society for the Study of Education (6).
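
Two of the standard units of measurement named above, percentile and standard score norms, can be computed from the scores of a norm group as follows. This is a minimal sketch using common textbook definitions, not the procedure of any particular test publisher: the percentile rank counts half of the tied scores, and the standard score is rescaled to a mean of 50 and a standard deviation of 10. Both conventions, and the function names, are assumptions made for the illustration.

```python
from statistics import mean, pstdev


def percentile_rank(raw, norm_scores):
    """Percent of the norm group scoring below raw,
    counting half of those who score exactly raw."""
    below = sum(1 for s in norm_scores if s < raw)
    equal = sum(1 for s in norm_scores if s == raw)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


def standard_score(raw, norm_scores, new_mean=50.0, new_sd=10.0):
    """Express raw relative to the norm group's mean and standard
    deviation, rescaled to new_mean and new_sd."""
    sd = pstdev(norm_scores)
    if sd == 0:
        raise ValueError("norm group has no spread")
    z = (raw - mean(norm_scores)) / sd
    return new_mean + new_sd * z
```

Because both units refer a pupil's raw score to the same norm group, results from different tests and different years can be entered on a record and compared, which is the long-period use of standard tests that the paragraph describes.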

In this chapter, the scope of discussion is restricted to a brief sur- 
vey and listing of representative standard achievement tests in lan- 
guage arts and mathematics. Short statements regarding the objectives 
evaluated by the test are followed by a tabular presentation of illus- 
trative tests for the various subjects of instruction. Detailed discussion 
and presentation of sample items, involving the comprehensive enu- 
meration of test titles, is obviously outside the scope of this chapter. 
In a like manner, for the language arts and mathematics, tests of inter- 
ests, attitudes, and critical thinking are reserved for discussion in sep- 
arate chapters devoted to these objectives of instruction and learning. 


INFORMATION ABOUT STANDARD ACHIEVEMENT TESTS 


Since an inventory or listing of available tests cannot be made
within the scope of this chapter, it may be helpful to indicate to the
reader the most fruitful source that he may consult in order to obtain
information and appraisal about standard tests of achievement. This
source is the series of mental measurements yearbooks edited by Buros
(1, 2, 3) and published periodically. The plan of these yearbooks is
to provide several appraisals of each test by experts. These appraisals
enable the reader to get a clear picture of the advantages and
limitations of each test. Practically all standard tests are reviewed.
This source of information should be consulted by anyone wishing a
more detailed statement of the various tests which are listed in
subsequent sections of this chapter.


Batteries and Series of Achievement Tests 


In achievement testing of basic skills and subjects, particularly at
the elementary-school level, most test publishers and authors provide
a battery of tests, bound in a single booklet, including the following
content: silent reading, language usage skills, arithmetic computation,
arithmetic problems, spelling, social studies, and elementary science.
Generally these tests are also available in separate booklets.

At the secondary-school level, there are few batteries of tests that
are bound together into a single booklet. Generally, the authors and
publishers have a series of tests on such basic subjects as English,
foreign languages, mathematics, science, and the social studies. Each
test is printed in a separate booklet. Each of these separate tests
measures the content most frequently found in typical courses of study
in the particular subject in representative high schools throughout the
nation. Table 3 lists some typical achievement test batteries at the
elementary-school level and series of tests at the secondary-school
level that are distributed nationally by commercial publishers.


TABLE 3  Illustrative Achievement Test Batteries or Series

TEST AND PUBLISHER            DATE     CONTENT                         GRADE

Elementary School

California Achievement        1950     Vocabulary, reading
Tests—Primary, Elementary,             comprehension, arithmetic
and Intermediate                       reasoning and fundamentals,
(California Test Bureau)               language

Iowa Every-Pupil Tests        1940     Reading, language, arithmetic,
of Basic Skills—Elementary             work-study skills
and Advanced
(Houghton Mifflin Company)

Metropolitan Achievement      1947     Reading, vocabulary, arithmetic
Tests—Primary 1, Primary 2,            fundamentals and problems,
Elementary, Intermediate,              language usage, spelling
and Advanced
(World Book Company)

Modern School Achievement     1944     Reading comprehension and
Tests                                  speed, arithmetic computation
(Bureau of Publications,               and reasoning, spelling, health
Teachers College,                      knowledge, history and civics,
Columbia Univ.)                        geography, elementary science

National Achievement Tests,   1940     Reading comprehension and
Municipal Battery                      speed, arithmetic fundamentals
(Acorn Publishing Co.)                 and reasoning, English,
                                       literature, geography, history,
                                       civics, health

SRA Achievement Series—       1954     Arithmetic, language arts,      4-6
Intermediate                           reading, work-study skills
(Science Research Associates)

Stanford Achievement Tests—   1952     Paragraph meaning, word
Primary, Elementary,                   meaning, arithmetic reasoning
Intermediate, and Advanced             and computation, social
(World Book Company)                   studies, science, spelling,
                                       study skills

Secondary School

California Achievement        1950     Reading, arithmetic, language   9-12
Tests—Advanced
(California Test Bureau)

Cooperative Test Series       1950     English, foreign language,      9-12,
(Educational Testing Service) et seq.  mathematics, science, social    College
                                       studies, etc.

Evaluation and Adjustment     1950     English, literature, science,   9-12
Series of High School Tests   et seq.  mathematics, history, health,
(World Book Company)                   study skills, etc.

Iowa Tests of Educational     1948     Basic social concepts, natural  9-12,
Development                            sciences, quantitative          College
(Science Research Associates)          thinking, interpretation of
                                       reading materials and literary
                                       materials, vocabulary, use of
                                       sources of information,
                                       correctness of expression


At the elementary-school level, most of the achievement test
batteries are valuable aids in the evaluation of growth of pupils. The
scores are comparable from subject to subject and among the different
forms of each battery. Thus, they permit an analysis of the pupils'
strengths and weaknesses, especially in the basic skills, and provide
a profile of the individual's progress through his elementary-school
career. The refined procedures used to select test content with
curricular and statistical validity, the standard methods of
administration and scoring, and the periodic revision of the tests and
their norms enable the local schools to use the tests for a variety of
instructional, guidance, and supervisory purposes.

At the secondary-school level, tests of general educational
development generally consist of exercises in which information is
presented in verbal, graphic, or other form, and the test exercises are
devised so as to measure the general abilities of the individual to
comprehend and interpret the material presented. Achievement measured
in this manner contrasts with test exercises which emphasize the recall
or recognition of specific items of information. These tests measure
the comprehensive objectives of instruction in the language arts,
literature, social studies, science, and mathematics.



Language Arts 
OBJECTIVES OF INSTRUCTION IN THE LANGUAGE ARTS 


The scope of the language arts includes experiences and learning in 
reading, writing, speaking, and listening. Each of these broad fields 
may be further subdivided into the more specific objectives which, 
when defined in terms of learner behavior, make it possible to evalu- 
ate the progress and growth of the individual. 

In reading, some of the specific objectives which teachers may wish 
to evaluate are abilities and skills in: (a) reading readiness, (b) vo- 
cabulary or word meaning, (c) comprehension, (d) work-study skills, 
(e) reasoning, and (f) tastes and interests in literature. 

In writing, among the objectives evaluated are abilities and skills in: 
(a) penmanship, (b) spelling, (c) writing grammatically correct sen- 
tences, (d) using correct punctuation, and (e) organizing and ex- 
pressing ideas clearly and in an interesting manner. 

In speaking or oral communication, the objectives to be evaluated
may include several skills common to reading and writing, such as:
(a) readiness, (b) vocabulary, (c) reasoning, (d) organizing ideas,
and (e) expressing ideas.

In listening, the objective to be measured is chiefly comprehension.
This, in turn, can be analyzed into such component factors as: (a)
vocabulary, (b) interest, (c) visualizing organization, and (d) ability
to make inferences.


EVALUATION OF READING 


The evaluation of reading involves not only the measurement of
skills and abilities at the reading readiness stage of the kindergarten
and first grade, but also the more complicated skills, abilities,
interests, attitudes, and appreciations at the high-school level. To
measure the scope of abilities, interests, needs, and attitudes in
reading, various reading tests and techniques have been devised. In
this discussion, the category of literature is subsumed as a part of
reading. Tests of work-study skills and library usage skills have also
been included.


Reading Readiness In order to judge the pupil's readiness for
reading, various tests and devices have been constructed to help the
teacher in evaluating whether the pupil has a reasonable chance for
success when he is introduced to formal reading. These tests, combined
with other observations and judgments about the pupil's physical,
mental, emotional, and social development, provide a guide for
evaluating achievement. Representative reading readiness tests are
listed in Table 4.


TABLE 4  Illustrative Reading Readiness Tests

TEST AND PUBLISHER            DATE     CONTENT                         GRADE

Gates Reading Readiness Tests 1939     Picture directions, word        1
(Bureau of Publications,               matching, flash card perception,
Teachers College,                      letter and number reading, etc.
Columbia Univ.)

Lee-Clark Reading Readiness   1951     Letter matching, letter and     1
Test                                   word discrimination
(California Test Bureau)

Metropolitan Readiness Tests  1933     Picture matching, copying       1
(World Book Company)                   figures, vocabulary, picture
                                       directions, etc.

Monroe Reading Aptitude Tests 1935     Visual tests, auditory tests,   1
(Houghton Mifflin Company)             motor control, oral speed,
                                       articulation, language

Murphy-Durrell Diagnostic     1949     Auditory discrimination,        1
Reading Readiness Test                 visual discrimination,
                                       learning rate


Reading readiness tests are usually administered to small groups of
children, comprising not more than ten or fifteen individuals at a
time. The directions call for drawing a line connecting pictures,
words, or letters that are alike; copying figures or letters; and
marking the drawing in a row of four or five drawings that represents
the word pronounced by the examiner, for example, the drawing of a
pencil. Following are sample test exercises of visual discrimination,
which measure ability to match words that are alike.


In each of the first two exercises the task of the pupil is
to find and mark two words that look exactly alike. No reference
is made to the meaning of the words. In the third test,
the pupil is asked to look at the word "cup" beside the picture
of the cup, and to draw a line on this word every time
he sees it in a story in the exercise. By this picture-word-matching
technique, he recognizes words in a meaningful situation.


a.| boy toy boy dog box ion cup 


c.| I have a cup. 


This cup is white. 


I drink from this cup. 


In addition to group tests of reading readiness, the Betts
Ready-to-Read Tests (distributed by Keystone View Co., Meadville,
Penna.) are elaborate individual tests for assessing physical readiness
for reading, visual perception, auditory perception, and lateral
dominance or sidedness of the hand, foot, and eye. The Betts-Keystone
Telebinocular apparatus is essentially a stereoscope that permits the
measurement of visual acuity, astigmatism, eye muscle balance, and
visual fusion when used with a standard series of slides devised by
Betts.


Silent and Oral Reading Among the first to construct pencil-and-paper
tests to measure primary reading abilities of words, sentences,
and paragraphs was Gates in his Tests of Comprehension of Words,
Sentences, and Paragraphs. Comprehension is measured by the child's
drawing a line from a word, sentence, or paragraph to the correct
pictorial illustration. For pupils above the first-grade level, such
test batteries as the Stanford, Metropolitan, Modern School, and
California include measures of vocabulary and paragraph comprehension.
Gray's Standardized Oral Reading Paragraphs and Oral Reading Check
Tests provide measures of oral reading abilities from about the first-
or second-grade level through the elementary grades.


Reading is an essential tool for the acquisition of concepts and
information in all areas of the curriculum. For this reason, it is
some- ... many important aspects of reading comprehension. The various
items require the child not only to get literal meanings and note
direct details, but also to infer the meanings of words; to see,
follow, and infer relationships; to weigh the significance of ideas; to
make subtle interpretations; and to draw general conclusions about the
content, tone, and purpose of the reading matter.

Table 5 contains representative reading tests at the elementary- and
secondary-school levels.


TABLE 5  Illustrative Reading Tests

TEST AND PUBLISHER            DATE     CONTENT                         GRADE

Elementary School

California Reading Test—      1950     Reading vocabulary, reading     1-3, ...
Primary, Elementary, and               comprehension
Intermediate
(California Test Bureau)

Durrell-Sullivan Reading      1937     Word meaning, paragraph         3-6
Achievement Test—Intermediate          meaning
(World Book Company)

Gates Reading Tests—Primary   1942     Word recognition, sentence      1-2
Reading                                reading, paragraph reading
(Bureau of Publications,
Teachers College,
Columbia Univ.)

Gates Reading Tests—Advanced  1942     Word recognition, paragraph     2-3
Primary                                reading
(Bureau of Publications,
Teachers College,
Columbia Univ.)

Gates Basic Reading Test      1942     Reading to appreciate general   3-8
(Bureau of Publications,               significance, to predict outcome
Teachers College,                      of events, to understand precise
Columbia Univ.)                        directions, to note details

Iowa Every-Pupil Test of      1940     Paragraph comprehension,        3-5,
Basic Skills: Test A. Silent           ability to grasp and understand 6-8
Reading Comprehension—                 details, organization of ideas,
Elementary and Advanced                ability to appreciate meaning,
(Houghton Mifflin Company)             vocabulary

Metropolitan Achievement      1947     Reading comprehension,          3-4,
Test, Reading—Elementary,              vocabulary                      5-6,
Intermediate, and Advanced                                             7-9
(World Book Company)

National Achievement Tests,   1953     Ability to understand general   4-9
Reading Comprehension Test             meaning, discriminate between
(Acorn Publishing Company)             words, evaluate factual
                                       material, interpret writer's
                                       attitudes

Nelson Silent Reading Test    1939     Ability to comprehend words,    3-9
(Houghton Mifflin Company)             to comprehend the general
                                       significance of a paragraph,
                                       to note details, to predict
                                       outcomes

SRA Achievement Series—       1954     Ability to understand main      4-6
Intermediate: Reading                  idea, to infer logical ideas,
(Science Research Associates)          to grasp details, to understand
                                       words in context

Stanford Achievement Test,    1952     Word meaning, paragraph
Reading—Elementary,                    comprehension
Intermediate, and Advanced
(World Book Company)

Secondary School

California Reading Test,      1950     Reading vocabulary, reading
Advanced                               comprehension
(California Test Bureau)

Cooperative English Test:     1950     Vocabulary, level of            7-12,
Test C. Reading                        comprehension, speed of         11-12
Comprehension—Test C1, Lower           comprehension
Level, and Test C2, Higher
Level
(Educational Testing Service)

Iowa Silent Reading Test:     1948     Rate of reading, comprehension, 9-12
New Edition—Advanced Test              word meaning, ability to use
(World Book Company)                   skills in locating information

Nelson-Denny Reading Test     1938     Vocabulary, paragraph reading   9-12
(Houghton Mifflin Company)

SRA Reading Record            1947     Rate, general comprehension,    9-12
(Science Research Associates)          paragraph meaning, sentence
                                       meaning, general vocabulary,
                                       etc.


The following test exercise in reading comprehension has been taken
from the Cooperative Test of Reading Comprehension, Lower Level,
Form T. The test items measure ability to note details, to infer
relationships, and to draw conclusions about the content and tone of
the passage.


"Alice!" called a voice.

The effect on the reader and her listener, both of whom were
sitting on the floor, was instantaneous. Each started and sat rigidly
intent for a moment; then, as the sound of approaching footsteps
was heard, one girl hastily slipped a little volume under the coverlet
of the bed, while the other sprang to her feet and in a hurried,
flustered way pretended to be getting something out of a tall
wardrobe.

Before the one who hid the book had time to rise, a woman of
fifty entered the room, and after a glance, cried, "Alice! How often
have I told you not to sit on the floor?"

"Very often, Mommy," said Alice, rising meekly, meantime casting
a quick glance at the bed to see how far its smoothness had
been disturbed.

"And still you continue such unbecoming behavior."

"Oh, Mommy, but it is so nice!" cried the girl. "Didn't you like to
sit on the floor when you were fifteen?"


1. Alice's companion was apparently
   1-1 a girl.
   1-2 her brother.
   1-3 her pet cat.
   1-4 the family dog.
   1-5 a doll. ................................................ 1( )

2. When Alice heard the approaching footsteps, she was principally
   2-1 ...
   2-2 angry.
   2-3 ...
   2-4 ...
   2-5 amused. ................................................ 2( )

3. We may infer that Alice is
   3-1 rarely disobedient.
   3-2 unable to read.
   3-3 very much in love.
   3-4 a spoiled child.
   3-5 fifteen years of age. .................................. 3( )

4. When she heard her name called, Alice was evidently
   4-1 reading to herself.
   4-2 reading aloud.
   4-3 lying in bed.
   4-4 dressing.
   4-5 making her bed. ........................................ 4( )

5. Alice was worried about the appearance of the bed because
   5-1 she had neglected to make it.
   5-2 her companion had been sitting on it.
   5-3 her companion was hiding under it.
   5-4 she had piled several books on it.
   5-5 she was afraid her mother might find the book. ......... 5( )



The following test exercises on word meaning, or vocabulary, taken
from the Stanford Achievement Reading Test, Intermediate, Form J,
illustrate one method of testing for achievement in this aspect of
reading.


1. A sawmill makes—1 wire  2 boots  3 needles  4 lumber ........ 1( )
2. A pair means—5 many  6 one  7 two  8 three ................... 2( )
3. Mary Smith and John Doe are cousins if they have the same—
   1 grandmother  2 mother  3 sister  4 daughter ................ 3( )
4. To receive a letter means to—5 mail it  6 get it  7 write it
   8 ...                                                          4( )


Generally, instruction in the elementary school stresses the
mechanics of reading, whereas the secondary school and college stress
the non-mechanical or thinking skills related to the reading process.
Although these higher skills are measurable, it is difficult to define
them and to construct items with which to measure them. Different tests
of reading skills at the secondary-school level apparently measure
different abilities in reading comprehension; hence, the tests are
intercorrelated only slightly. The relatively low correlation of the
results among the different reading tests at the higher level may be
attributed to the fact that no single analysis of reading skills at
that level has been generally accepted. Also, the low correlation may
indicate that the reading skills which secondary schools emphasize in
instruction are numerous and not adequately sampled in any one given
test (8).


Evaluating Readability of Printed Materials Research in readability
began with the attempt to classify the content of textbooks and
other printed materials according to the level of comprehension
difficulty, for example, fourth-grade, fifth-grade, sixth-grade, and
other levels of average reading ability required for reasonable
understanding of the content. The readability formula is a technique
for estimating levels of difficulty in various reading materials.
Several readability formulas are now used. The Lorge (8, 9) formula,
for example, computes grade level difficulty on the basis of average
sentence length, percentage of hard words, and prepositional phrases.
The Flesch (5) formula is based on average sentence length and
syllables per hundred words. An interest measure can also be computed
on the basis of "personal words" and "personal sentences." The
Dale-Chall (4) formula is based on a count of unfamiliar words and
average sentence length.
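
A formula of the Flesch type can be sketched in a few lines of Python.
The chapter names only the two elements that enter the formula, average
sentence length and syllables per hundred words; the constants below
are assumed from Flesch's published 1948 Reading Ease formula and do
not come from this text. The function name and the sample counts are
hypothetical. A higher score indicates easier reading.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Return a Reading Ease score from raw counts (higher means easier)."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    average_sentence_length = words / sentences
    syllables_per_100_words = 100.0 * syllables / words
    # Constants assumed from Flesch's 1948 Reading Ease formula,
    # not taken from this chapter.
    return (206.835
            - 0.846 * syllables_per_100_words
            - 1.015 * average_sentence_length)


# Hypothetical passage: 100 words, 5 sentences, 150 syllables.
score = flesch_reading_ease(words=100, sentences=5, syllables=150)
print(round(score, 1))
```

The counts themselves (words, sentences, syllables) are supplied by the
user of the sketch; counting syllables automatically is a separate
problem that is not attempted here.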




Applying the Lorge formula to Lincoln's Gettysburg Address, the
grade placement or level of readability is 6.5, or sixth grade fifth
month. This is calculated as follows:


1. Number of words in the passage              269
2. Number of sentences in the passage           10
3. Number of prepositional phrases              26
4. Number of hard words in the passage          43

Computation

Average sentence length: Divide 1 by 2         = 26.90 × .07   = 1.8830
Ratio of prepositional phrases: Divide 3 by 1  = .0967 × 13.01 = 1.2582
Ratio of hard words: Divide 4 by 1             = .1599 × 10.73 = 1.7151
Constant weight                                                = 1.6126
Readability Index                                              = 6.4694


The Lorge formula is representative of other readability formulas in
which statistical weights are applied to structural elements in a
reading selection. The meaning of the index is the school grade at
which the passage can be understood by the average pupil. The index can
be used to place texts and other books at appropriate grades. Also, it
indicates ways in which passages may be rewritten for appropriate
placement at designated grade levels.
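
The Gettysburg Address computation can be retraced with a short Python
sketch. It assumes the weights of the worked example are .07 for
average sentence length, 13.01 for the ratio of prepositional phrases,
10.73 for the ratio of hard words, and a constant of 1.6126, and that
the passage contains 43 hard words (the count that yields the printed
ratio of .1599). The function name is hypothetical.

```python
def lorge_readability_index(words: int, sentences: int,
                            prepositional_phrases: int,
                            hard_words: int) -> float:
    """Grade-placement index computed from four counts for a passage."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    average_sentence_length = words / sentences
    prepositional_ratio = prepositional_phrases / words
    hard_word_ratio = hard_words / words
    # Weights and constant as read from the worked example (assumed values).
    return (0.07 * average_sentence_length
            + 13.01 * prepositional_ratio
            + 10.73 * hard_word_ratio
            + 1.6126)


# Counts listed for the Gettysburg Address in the worked example.
index = lorge_readability_index(words=269, sentences=10,
                                prepositional_phrases=26, hard_words=43)
print(round(index, 2))
```

Because the sketch carries the ratios at full precision rather than
rounding them to four places, its result may differ from the printed
index in the third or fourth decimal place; both round to a grade
placement of about 6.5.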


Work-Study Skills Work-study skills, which are an aspect of reading,
cut across many subject areas and, therefore, are discussed briefly.
One of the earliest tests of these skills was Test B: Work-Study Skills
of the Iowa Every-Pupil Test battery. This test provides measures of
ability to read maps, graphs, charts, and tables, ability to find a
topic in an appropriate reference book, and ability to use an index.
Various library usage tests, likewise, have been constructed. In 1952
the Stanford Achievement Test battery included a subtest on similar
study skills for the intermediate and advanced forms, and the Spitzer
Study Skills Test for high-school and college levels was published.
Still more recently, the SRA Achievement Series has also included a
test of work-study skills at the elementary level. Representative study
skills tests are listed in Table 6.

The following test exercises will indicate how some of the study
skills are measured. The ability to read charts, tables, and maps is
checked by reproducing the data in pictorial form and testing
comprehension through a series of questions or test items referring to
the drawing. The bar graph test exercises that follow appear in the
Stanford Achievement Study Skills Test, Intermediate, Form J.



Use the bar graph below in answering questions 1-5.

[Bar graph showing the average elevation of each state listed]

1 Which state has the highest average elevation?
  a Utah   b Montana   c Arizona   d Kansas
2 What is the average elevation of Kansas?
  e 1000 ft.   f 1500 ft.   g 2000 ft.   h 3000 ft.
3 The highest average elevation is nearest
  a 5750 ft.   b 6000 ft.   c 6500 ft.   d 7000 ft.
4 The average elevation of Arizona is closest to that of
  e Utah   f New York   g Kansas   h Montana
5 How many of the states listed on the graph have average
  elevations of 2000 feet or more?
  a 1   b 2   c 3   d 4


TABLE 6  Illustrative Tests of Work-Study Skills

TEST AND PUBLISHER            DATE     CONTENT                         GRADE

Iowa Every-Pupil Tests of     1940     Map reading, use of             3-5,
Basic Skills: Test B.                  references, use of index,       6-8
Work-Study Skills—Elementary           use of dictionary,
and Advanced                           alphabetizing, reading of
(Houghton Mifflin Company)             graphs, charts, tables

Peabody Library Information   1940     Library information, library    4-8
Test—Elementary Level                  skills
(Educational Test Bureau)

Spitzer Study Skills Test     1952     Using dictionary, index,        9-12
(World Book Company)                   sources of information,
                                       understanding graphs, tables,
                                       maps, note taking

SRA Achievement Series—       1954     Using index, table of           4-6
Intermediate: Work-Study               contents, reference materials,
Skills                                 interpretation of graphs and
(Science Research Associates)          tables

Stanford Achievement Test,    1952     Reading charts and tables,      5-6,
Study Skills—Intermediate              maps, using dictionary, index   7-9
and Advanced
(World Book Company)


The following test exercises from the Iowa Every-Pupil Test of
Work-Study Skills, Elementary Battery, Form N, illustrate the
measurement of skill in using references.


1. Which would you use to find what a fable is?
   ____An atlas            ____A geography book
   ____A dictionary        ____The Arabian Nights
2. If you wanted to learn about cork, which of these would you
   use?
   ____An encyclopedia     ____An atlas
   ____A dictionary        ____A history book
3. Which tells the correct way to divide Christmas at the end of a
   line?
   ____A speller           ____A language book
   ____A dictionary        ____An encyclopedia


In a like manner, the ability to use an index or a dictionary is meas- 
ured by reproducing facsimile pages and asking questions which re- 
quire the pupil to use these facsimile pages in answering the questions. 


TABLE 7  Illustrative Literature Tests

TEST AND PUBLISHER            DATE     CONTENT                         GRADE

Carroll Prose Appreciation    1935     Ability to differentiate good   7-9,
Test                                   prose from poor, and poor       10-12,
(Educational Test Bureau)              from very bad                   13-16

Center-Durost Literature      1952     Identification of excerpts      11-13
Acquaintance Test                      from books worth reading
(World Book Company)

Cooperative Literary          1951     Perception of author's          10-16
Comprehension and                      viewpoint, recognition of
Appreciation Test                      literary devices, appreciation
(Educational Testing Service)          of style and rhythm

Iowa Tests of Educational     1948     Ability to understand a         9-12
Development: Test 7.                   passage: its purpose, mood,
Interpretation of Literary             literary devices used
Materials
(Science Research Associates)

Rigg Poetry Judgment Test     1942     Ability to differentiate parody 9-16
(Bureau of Educational                 from correct version of poem;
Research and Service,                  brief extracts used
State University of Iowa)



Literary Discrimination and Appreciation Techniques for appraisal
in literature have been constructed especially for discrimination and
appreciation. Most of these newer appraisal techniques in literature
have tended to be of the objectively-scored type. Efforts have been
made in the Cooperative Literary Comprehension and Appreciation
Test, for example, to provide a valid measure at the high-school level
of such insights and abilities as the student's perception of the mood
of a selection, his emotional reactions to a passage, and his
recognition of the mood or feeling and tone of the passage, whether it
is facetious or serious, animated or matter of fact, satirical,
humorous, or burlesque. Table 7 lists literature tests. For a more
complete listing and an appraisal of literature tests, refer to Buros'
mental measurements yearbooks (1, 2, 3).

The following exercise is taken from the Cooperative Literary
Comprehension and Appreciation Test.


A FOREIGN RULER 


(Written in the time of Napoleon) 
(1) He says, My reign is peace, so slays 
(2) A thousand in the dead of night. 
(3) Are you all happy now? he says,
(4) And those he leaves behind cry Quite.
(5) He swears he will have no contention,
(6) And sets all nations by the ears;
(7) He shouts aloud, No intervention!
(8) Invades, and drowns them all in tears.


15. This poem is applicable to recent times because of
    15-1 the modernity of its verse form.
    15-2 its comforting philosophy.
    15-3 the comparison to be drawn between Napoleon and Hitler.
    15-4 its international spirit.
    15-5 its anti-interventionist attitude. .................. 15( )

16. The people mentioned in line 4 say that they are happy,
    because
    16-1 their enemies have been killed.
    16-2 they have a good ruler.
    16-3 they are afraid to say that they are not.
    16-4 they are glad to be at peace.
    16-5 they have been drowned. ............................. 16( )

17. In this stanza italics are used to
    17-1 change the rhythm.
    17-2 give an impression of shouting.
    17-3 keep the reader from reading too fast.
    17-4 create emphasis.
    17-5 take the place of quotation marks. .................. 17( )

18. The writer's attitude toward the foreign ruler is one of
    18-1 admiration.
    18-2 submission.
    18-3 defiance.
    18-4 antagonism.
    18-5 gratitude. .......................................... 18( )


EVALUATION OF WRITING 

Grammar, Usage, and Spelling In attempts to measure writing skills 
more economically, various standardized objective-type tests have 
been constructed. While there is a relatively high correlation between 
the rating of pupil products in writing and the results of such tests, it 
is true that these objective tests do not measure certain of the aspects 
and interrelationships in writing that may be obtained from the scor- 
ing of a pupil essay, composition, or similar writing effort. 

In addition to the tests listed in Table 8, most of the achievement
test batteries contain subtests on usage and spelling. Separate spelling
tests are available also from most commercial publishers of standard
tests. For a more complete listing and description of these, see the
mental measurements yearbooks edited by Buros (1, 2, 3).


Analysis of Written Products for Content and Expression The second 
major aspect of evaluating writing includes the measurement of the 
individual's ability to express ideas in various written forms. For many 
Years the essay, theme, or composition has been the teacher's method 
for measuring this outcome of the language arts. For evaluation, this 
Objective has generally been defined in terms of related skills and 
abilities; Some of the more commonly defined abilities are (a) writ- 
ing grammatically correct sentences or paragraphs, (b) organizing 
ideas, (c) spelling correctly, (d) expressing ideas in a form which is 
interesting and clear, and (e) capitalizing and punctuating a sentence 
Or a paragraph correctly. 

All of = abilities m many others, it has been assumed, can be 
Measured adequately in a student's or learner's theme, essay, or com- 
Position. The conventional method of appraising a composition has 
been for the teacher to make an overall evaluation of it on the basis 
of general impression. The evidence indicates that these overall im- 
Pressions are usually based on rather superficial qualities of the writ- 
ten product. If an overall appraisal of the written product is to have 


258 Evaluating Major Objectives and Situations 


meaning, reliability, and validity, it is necessary to define the factors 
which enter into the total evaluation and to assign or determine the 
weight to be given to each factor. For example, if a written product is 
to be evaluated on the basis of such factors as spelling, punctuation, 
paragraphing, grammar and syntax, accuracy, vocabulary, power of 
expression, and general impression, then each of these factors must be 
rated separately before combining them into a total. Furthermore, the 
weight to be assigned to each of these categories must be determined. 
Stalnaker (11) and others recommend a compromise between the de- 
tailed analytic rating scale method and the overall general impression 
method. In this compromise, before the teacher rates the composition 
on the basis of overall impression, he defines the main factors that are 
to be kept in mind in forming this impression. 
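
The weighting procedure described above can be illustrated with a short computational sketch. It is offered only as an illustration under stated assumptions: the factor names follow the example in the text, but the five-point rating scale, the particular weights, and the function name composite_rating are hypothetical and are not prescribed by Stalnaker or by any published scale.

```python
from typing import Mapping


def composite_rating(ratings: Mapping[str, float],
                     weights: Mapping[str, float]) -> float:
    """Combine separately rated factors into one weighted mean."""
    if set(ratings) != set(weights):
        raise ValueError("each factor needs both a rating and a weight")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(ratings[f] * weights[f] for f in ratings) / total_weight


# Hypothetical ratings of one composition on an assumed 1-5 scale.
ratings = {
    "spelling": 4, "punctuation": 3, "paragraphing": 4,
    "grammar and syntax": 3, "accuracy": 5, "vocabulary": 4,
    "power of expression": 2, "general impression": 3,
}
# Hypothetical weights; the teacher would fix these before rating begins.
weights = {
    "spelling": 1, "punctuation": 1, "paragraphing": 1,
    "grammar and syntax": 2, "accuracy": 2, "vocabulary": 1,
    "power of expression": 3, "general impression": 1,
}

print(round(composite_rating(ratings, weights), 2))
```

Because the result is a weighted mean, it stays on the same scale as the separate ratings, so totals remain comparable from one set of papers to the next as long as the weights are held fixed.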

The principles of rating a written product are essentially those
involved in scoring the essay examination. For a more detailed
discussion and illustration of these principles and techniques, see
Chapter Six.

TABLE 8  Illustrative Tests of Expression (Grammar, Usage, Spelling)

TEST AND PUBLISHER                        DATE      CONTENT                                   GRADE

Elementary School

California Language Test: Primary,        1950      Mechanics of English, grammar,            1-3, 4-6,
Elementary, and Intermediate                        spelling                                  7-9
(California Test Bureau)

Iowa Every-Pupil Tests of Basic Skills:   1940      Punctuation, capitalization, usage,       [illegible]
Test C. Basic Language Skills,                      [remainder illegible]
Elementary and Advanced
(Houghton Mifflin Company)

Iowa Language Abilities Test:             1948      Spelling, word meaning, language usage,   4-7, 7-10
Elementary and Intermediate                         capitalization, punctuation
(World Book Company)

SRA Achievement Series, Intermediate:     1954      Capitalization, punctuation, language     4-6
Language Arts                                       usage, grammar
(Science Research Associates)

Secondary School

Barrett-Ryan-Schrammel English Test:      1952      Functional grammar, punctuation, the      9-12
New Edition                                         sentence, vocabulary, pronunciation
(World Book Company)

California Language Test: Advanced        1950      Mechanics of English, grammar,            9-12
(California Test Bureau)                            spelling

Cooperative English Test: Test A.         1950      Grammar, punctuation, capitalization,     7-12, 11-12
Mechanics of Expression, Lower and                  spelling
Higher Levels
(Educational Testing Service)

Cooperative English Test: Test B1.        1950      Sentence structure and style, verbal      7-12, 11-12
Effectiveness of Expression, Lower and              skill, diction, organization
Higher Levels
(Educational Testing Service)

Iowa Tests of Educational Development:    1948      Punctuation, usage, capitalization,       9-12
Test 3. Correctness and Appropriateness             spelling, diction, phraseology,
of Expression                                       organization
(Science Research Associates)


Handwriting  In penmanship, the measure is mainly the legibility of
the individual's writing. The evaluation is mainly in terms of the
performance or product of the individual. To evaluate this product,
various standardized product scales of penmanship, such as those by
Conrad, Freeman, and others, are available for use. Informal
teacher-made scales may also be used. The reproduction on pages 260 and
261 of a part of a product scale for measuring handwriting indicates
how the teacher may compare the child's product with a standard
product to assign a scale value.

The limitations of the handwriting product scales are substantially
the same as those of any scoring device, such as standard answers for
an essay examination. The scoring is partly subjective and partly
objective. The degree of objectivity obtained in the use of a scale is a
function of the insight, experience, and background of the examiner
who applies it.


FIGURE 9  A Handwriting Scale. The quality of writing is indicated by
the number value written above each sample. The complete scale is
divided into values of 20, 30, 40, 50, 60, 70, 80, and 90. Scale by
Leonard P. Ayres, published by Russell Sage Foundation, 1935.


TABLE 9  Illustrative Handwriting Scales

TEST AND PUBLISHER                        DATE      CONTENT                                   GRADE

Conrad Manuscript Writing Standards       1929      A scale with which handwriting samples    1-8
(Bureau of Publications, Teachers                   are compared for quality
College, Columbia Univ.)

Freeman Handwriting Measuring Scale       1935      A scale with which handwriting samples    1-6
(Zaner-Bloser)                                      are compared for quality

West Chart for Diagnosing Elements of     1926      Measures letter form, slant,              1-6
Handwriting                                         coordination, motor control, and
(Public School Publ. Co.)                           spacing


EVALUATION OF SPEAKING AND LISTENING


Although speaking and listening abilities and skills are considered
essential parts of the language arts, few standard tests or scales are
available for general use. Compared with reading and writing, research
and experimentation in the measurement of speaking and listening are
relatively limited. Some rating scales for speech are listed in Buros
(1, 2, 3), and the World Book Company has published a listening
comprehension test.


Speech Scales  Two scales for speech, Speech Attitude Scale and
Speech Experience Inventory, have been published by the C. H.
Stoelting Company. The Speech Attitude Scale for grades 9-16 contains
items to provide information for discovering and diagnosing the
individual's attitudes toward speech. The Speech Experience Inventory
for grades 9-16 consists of items designed to discover the range
and kinds of speech experiences in which an individual engages.
Technically, these are scales or checklists rather than tests of
abilities and skills in speech or speaking.

To evaluate oral composition, Netzer (10) experimented with a
technique analogous to product scales in writing. Just as the quality
of a sample of writing may be measured by comparing it with a series
of samples graded in quality, so may the quality of a student's oral
communication be measured by comparing it with a series of samples.
Netzer's scales classify an elementary-school pupil's verbal responses
to a story, to a picture, and to an object. These scales use recordings
that are graded in quality. This technique is in an experimental stage,
but it offers possibilities for development of evaluation in new
directions.




Listening Tests The Brown-Carlsen Listening Comprehension Test 
(World Book Co.) for grades 9-13 is the first to appear in commer- 
cially available form. It is designed to measure the ability of pupils to 
comprehend what they hear. The test is administered entirely orally
and may be given to groups of regular class size. The seventy-six
test questions are organized into five parts to appraise immediate
recall, following directions, recognizing transitions, recognizing
meanings, and comprehending a lecture.

The appearance of this test may stimulate further research and 
experimentation in the measurement of abilities and skills in listening. 
It represents a new and promising step in the measurement of a
neglected objective of the language arts.


Mathematics 


Standardized tests in mathematics have been constructed for a
variety of objectives. At the elementary-school level computational
skills and problem-solving are measured in the conventional subtests
of the Metropolitan, Stanford, California, Modern School, and other
achievement test batteries. Diagnostic tests for analyzing strengths
and weaknesses in arithmetic have also been constructed. The better
known tests are the Brueckner Diagnostic Arithmetic Tests and the
Compass Diagnostic Tests in Arithmetic. The majority of achievement
tests in arithmetic deal with computation and problem solving; the
types of items measure skill in the four fundamental operations for
whole numbers, mixed numbers, fractions [...]. In recent years,
however, there has been a trend [...] which measure such objectives as
arithmetic [...] quantitative relationships, mathematical [...]
mathematical judgment.

At the junior-high-school and high-school level, two types of
achievement tests are found. The first type may be illustrated by the
Cooperative Mathematics Test for grades 7, 8, and 9, which emphasizes
arithmetical skills, terms and concepts, applications, and
appreciations. The second general type of tests measures knowledge,
skill, and understanding in a special mathematical field, such as
algebra, plane geometry, or trigonometry.

A discussion of aptitude or prognostic tests in mathematics will be
found in Chapter Seventeen.


TABLE 10  Illustrative Mathematics Tests

TEST AND PUBLISHER                        DATE      CONTENT                                   GRADE

Elementary School

Brueckner Diagnostic Arithmetic Tests     1940      Three test booklets: whole numbers,       [illegible]
(Educational Test Bureau)                           fractions, decimals

California Arithmetic Test: Primary,      1950      Computation, problem solving              [illegible]
Elementary, and Intermediate
(California Test Bureau)

Compass Diagnostic Tests in Arithmetic    1925      Twenty tests, each covering a basic       [illegible]
(Scott, Foresman and Co.)                           arithmetic process in detail

Functional Evaluation in Mathematics:     1952      Computation, problems, quantitative       [illegible]
Elementary and Advanced                             understanding
(Educational Test Bureau)

Iowa Every-Pupil Test of Basic Skills:    1940      Arithmetic vocabulary, computational      [illegible]
Test D. Basic Arithmetic Skills,                    skill, whole numbers
Elementary and Advanced
(Houghton Mifflin Company)

Metropolitan Achievement Test,            1947      Arithmetic fundamentals, problems         [illegible]
Arithmetic: Elementary, Intermediate,
and Advanced
(World Book Company)

SRA Achievement Series, Intermediate:     1954      Arithmetic reasoning, arithmetic          [illegible]
Arithmetic                                          vocabulary, number recognition,
(Science Research Associates)                       computation

Stanford Achievement Test, Arithmetic:    1952      Computation, problem solving              [illegible]
Elementary, Intermediate, and Advanced
(World Book Company)

Secondary School

Cooperative Algebra Test. Elementary      1940      Basic skills and principles, processes,   End of course
Algebra through Quadratics                et seq.   relationships
(Educational Testing Service)

Cooperative Intermediate Algebra Test.    1940      Exponents, factoring, progressions,       End of course
Quadratics and Beyond                     et seq.   logarithms, imaginary numbers, graphs,
(Educational Testing Service)                       etc.

Cooperative Mathematics Test for          1940      Skills, facts, terms and concepts,        7-9
Grades 7, 8, and 9                        et seq.   applications, appreciations
(Educational Testing Service)

Cooperative Plane Geometry Test           1940      Fundamentals of first course in           End of course
(Educational Testing Service)             et seq.   geometry

Cooperative General Mathematics Test      1938      Arithmetic, algebra, plane geometry,      12
for High School Classes                             trigonometry, solid geometry
(Educational Testing Service)

Davis Test of Functional Competence       1950      Consumer problems, graphs and tables,     9-12
in Mathematics                                      equations, etc.
(World Book Company)

Foust-Schorling Test of Functional        1944      Ability to think in terms of concepts     9-12
Thinking in Mathematics                             and symbols of mathematics independent
(World Book Company)                                of computational ability

Iowa Tests of Educational Development:    1948      General mathematics involving practical   9-12
Test 4. Ability to Do Quantitative                  problems
Thinking
(Science Research Associates)

Lankton First-Year Algebra Test           1951      Vocabulary, meaning and use of symbols,   End of course
(World Book Company)                                formulas, graphs, etc.

Seattle Algebra Test                      1951      Basic terms, formulas, binomials,         End of course
(World Book Company)                                simultaneous equations

Shaycroft Plane Geometry Test             1950      Vocabulary, construction, computational   End of course
(World Book Company)                                skills, logical proof

[illegible] General Mathematics Test      1950      Arithmetical concepts, informal           End of course
(World Book Company)                                geometry, graphs, algebraic principles


The test exercises used to measure computational ability and prob- 
lem solving are well known to most persons. The following exercises 
from Test 1—Quantitative Understanding of the Functional Evalua- 
tion in Mathematics Series illustrate an emphasis on the measurement 
of meaning in mathematics. 

Which one of the following is the largest number of pages?
A. 199 pages   B. 109 pages   C. 201 pages   D. 210 pages

If you know the price of one bushel of apples, what is the quickest
way to find the cost of 16 bushels of apples?
A. adding   B. subtracting   C. multiplying   D. dividing


From the Cooperative Mathematics Test for grades 7, 8, and 9, Form
Y, the following exercises are reproduced:


17. Of the following, which information might, in itself, be best
    represented on a graph?
      1. The number of pupils making no mistakes on a certain
         arithmetic test
      2. The average speed of a boat on a certain trip
      3. The student receiving the highest grade in the final
         examination in English
      4. The hour of a particular day when the temperature was
         highest
      5. The number of absences in a class during each month of
         the year
29. "If equals are added to equals, the results are equal." Which
    of the following is the best example of that statement?
      1. [illegible]
      2. 2 + k = 2 + k
      3. [illegible]
      4. 4 + 0 = 4
      5. 2 + 2 = 4

Arithmetic ranks with reading and writing as a basic tool of
communication in modern culture. It is intrinsic to most activities of
daily living: buying, making change, telling time, measuring, keeping
score, and similar activities. Tests have been devised to measure
computational skills, problem solving, and understanding of the meaning
of the number system. Diagnostic tests in arithmetic permit an analysis
of the strengths and weaknesses of an individual pupil. In a like
manner, tests of algebra and plane geometry stress not only skills but
also meanings of the processes involved. These tests help the modern
school to evaluate the effectiveness of its mathematics program.


The following illustrative example is cited from the Lankton First- 
Year Algebra Test. 


12. If n + 2n = 12, then the value of n is
    a. 1   b. 2   c. 3   d. 4   e. 5


An exercise from the Shaycroft Plane Geometry Test illustrates a 
type of item used to measure achievement in this subject. 


53. In parallelogram ABCD above, AB = BC = CD = DA = 6. What is the
    area of the parallelogram?
      a. 18
      b. 24
      c. 36
      d. impossible to determine answer without additional
         information
      e. none of the above


Test exercises to measure the objectives and outcomes of mathematics
are in the process of improvement and refinement. New tests
for new objectives are being developed. By means of standard tests
and informal evaluation techniques, teachers are appraising the
achievement of pupils in mathematics courses from the first grade
through the college years of learning.




Summary 


Representative achievement tests in the language arts and mathe- 
matics are reviewed and briefly discussed in this chapter. The lan- 
guage arts include abilities and skills in reading, writing, speaking, 
and listening, each of which is ordinarily divided into subordinate 
abilities and skills. Writing, for example, includes penmanship, spell- 
ing, grammar, and composition. Mathematics includes abilities, skills, 
and concepts in arithmetic, algebra, and geometry. Standard tests in 
the language arts and mathematics have been constructed to appraise 
the several objectives of each curriculum area.

Achievement tests in the basic skills at the elementary-school level 
are frequently published in battery form, bound in a single booklet. 
At the secondary-school level, the common practice is to publish a 
series of tests for each school subject, each in a separate booklet. 

In order to measure achievement in the basic skills adequately, 
standard tests should have curricular validity; that is, they should 
measure fairly the extent to which pupils have learned the curricular 
experiences provided in the schools. Since the abilities and skills in 
the language arts and mathematics are fairly uniform in the curriculum 
of most communities, the available standard tests tend to be valid 
measures of achievement for objectives of these subjects. In reading, 
these objectives include reading readiness, silent and oral reading, 
readability of printed materials, work-study skills, and literary 
appreciation and discrimination. In writing, representative tests of 
achievement are discussed for the objectives of grammar, spelling, 
composition, and handwriting. For appraisal of speaking and listening, 
some available speech scales and a listening test are briefly described. 
In mathematics, representative standard tests of achievement are cited
for arithmetic, algebra, and geometry. Representative achievement
tests in the various subject-matter categories are cited, but it is
suggested that the reader consult the series of mental measurement
yearbooks edited by Buros for more complete information about specific
tests.


Problems for Class Discussion 


1. Analyze the results of a reading test administered to a class. Indicate
the range of reading materials required to meet the individual
differences in ability.

2. Administer a language usage test to a small group of pupils. Prepare
a diagnostic summary of group needs for instructional purposes.

3. Construct an informal mathematics test. Administer it to several pupils
and diagnose any individual weaknesses revealed.


References Cited in This Chapter 


1. Buros, O. K., editor, The Nineteen Forty Mental Measurements Year- 
book. Highland Park, N. J.: The Mental Measurements Yearbook, 1941. 

2. Buros, O. K., editor, The Third Mental Measurements Yearbook. New 
Brunswick, N. J.: Rutgers University Press, 1949. 

3. Buros, O. K., editor, The Fourth Mental Measurements Yearbook. High- 
land Park, N. J.: The Gryphon Press, 1953. 

4. Dale, E., and Chall, J. S., "Formula for Predicting Readability" and
"Instructions," Educational Research Bulletin, 24:11-20 and 37-54,
January-February, 1948.

5. Flesch, R., "A New Readability Yardstick," Journal of Applied
Psychology, 32:221-233, June, 1948.

6. Forty-Fifth Yearbook of the National Society for the Study of Education,
Part I, "The Measurement of Understanding." Chicago: University of
Chicago Press, 1946.

7. Lindquist, E. F., editor, Educational Measurement. Washington, D. C.:
American Council on Education, 1951.

8. Lorge, I., "Predicting Readability," Teachers College Record, 45:404-419,
March, 1944.

9. Lorge, I., "Readability Formulae: An Evaluation," Elementary English,
26:86-95, February, 1949.

10. Netzer, R. F., "The Evaluation of a Technique for Measuring
Improvement in Oral Composition," Journal of Experimental Education,
6:35-89, 1937.

11. Stalnaker, J. N., "Essay Examinations Reliably Read," School and
Society, 45:671-672, November, 1937.


References for Further Reading 


Brueckner, L. J., and Grossnickle, F. C., How to Make Arithmetic
Meaningful. Philadelphia: John Winston Co., 1947.

In Chapter 10 of this volume, the authors provide a comprehensive
discussion of methods for the informal and formal appraisal of various
objectives of arithmetic instruction.


Commission on the English Curriculum, National Council of Teachers of
English, The English Language Arts. New York: Appleton-Century-Crofts,
Inc., 1952.

In Chapter 18 of this volume, methods of evaluating instruction in the
language arts are discussed. Emphasis is placed upon an evaluation
program that is comprehensive, flexible, and continuous. Both informal
and formal methods of appraisal are reviewed.


CHAPTER FIFTEEN

Evaluating Achievement in Selected Courses


Compared with the basic abilities and skills that are 
universally sought in the language arts and mathematics, other courses 
in the curriculum vary considerably from locality to locality in their 
content and specific objectives. This variability makes difficult the con- 
struction of standard achievement tests for wide-scale use. In this 
chapter, achievement tests are briefly reviewed for such selected 
courses of the curriculum as social studies, natural sciences, music and 
art, foreign languages, industrial arts, home economics, and business 
education. 


CONTENT VALIDITY 

In the selected courses, the test user should be concerned with 
content, or curricular, validity. Before he uses a test in the social 
studies, for example, he should check the content of the test against 
the content of the local curriculum to determine whether or not the 
test material includes a representative sample of the abilities, informa- 
tion, knowledge, and skills which pupils have had an opportunity to 
learn. Unless the test measures fairly the objectives of the curriculum, 
its use in a specific school situation is harmful rather than beneficial. 
The tests of general educational development, Interpretation of
Reading Materials in the Social Studies and Interpretation of Reading
Materials in the Natural Sciences, distributed by the Educational
Testing Service, and corresponding tests in the Iowa series distributed
by Science Research Associates, are attempts to devise tests for
wide-scale use regardless of the specific content of the local
curriculum. As indicated by the titles of these tests, the emphasis is
upon measurement of the interpretation of reading materials. Specific
objectives and content, except for courses that are fairly uniform in
the schools of the nation, are best measured by tests constructed
locally.
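
The check of test content against the local curriculum described above can be sketched as a simple tally. The sketch below is an illustration only, not a procedure given by the authors: the topic lists, the function name content_coverage, and the idea of reporting a single proportion are all assumptions, and no particular proportion is suggested here as a standard of adequate validity.

```python
from typing import Iterable, List, Tuple


def content_coverage(test_topics: Iterable[str],
                     curriculum_topics: Iterable[str]) -> Tuple[float, List[str]]:
    """Return the share of test topics taught locally, and the untaught ones."""
    test = set(test_topics)
    taught = set(curriculum_topics)
    if not test:
        raise ValueError("the test must cover at least one topic")
    untaught = sorted(test - taught)
    share = len(test & taught) / len(test)
    return share, untaught


# Hypothetical topic lists for a social studies test and a local course.
test_topics = ["map reading", "civics", "tariffs", "graphs"]
local_course = ["map reading", "civics", "graphs", "local history"]

share, untaught = content_coverage(test_topics, local_course)
print(f"{share:.0%} of test topics are taught locally; not taught: {untaught}")
```

A tally of this kind only flags topics that pupils have had no opportunity to learn; judging whether the remaining items sample the local objectives fairly still rests with the test user.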


TESTS OF KNOWLEDGE AND PERFORMANCE 


The majority of standard achievement tests are pencil-and-paper 
types which measure knowledge acquired by pupils. This type of test 
is easier to administer and to score than the performance type of test. 
Performance tests are most commonly used to measure achievement 
in bookkeeping, shorthand, typing, drawing, sewing, and in the rating 
of products in industrial arts courses. In these tests, the pupil per- 
forms standard job samples and his product is rated on speed of per- 
formance and freedom from faults in workmanship. 


Social Studies 


The objectives of the social studies include the acquisition of basic 
concepts and information, study skills involving the reading of maps, 
graphs, and charts, the development of critical thinking, and growth 
in desirable attitudes and beliefs. In this chapter the discussion will be 
limited to the measurement of concepts, knowledges, and skills. 

At the elementary-school level, the testing or evaluation of concepts 
and information in history, geography, and civics most frequently ap- 
pears in the form of subtests of general achievement test batteries, 
such as the Stanford, Metropolitan, Modern School, and California 
series. It should be pointed out, again, that social studies tests need 
to be checked against the curriculum content of the local course of 
study to make sure that they are valid measures of the concepts, facts, 
and skills that are emphasized in the local school curriculum. The 
variation in content of the social studies from community to com- 
munity makes the measurement of specific facts and information dif- 
ficult. 

To some extent, this same caution should be observed at the sec- 
ondary-school level because of the variation in the content of social 
studies courses in junior and senior high schools from community to 
community. There is a tendency, however, for the content of Amer- 
ican history courses, world history courses, and other social studies 
courses to be more uniform in the higher grades. At the secondary- 
school level, moreover, new developments in tests have emphasized 
the measurement of understanding and study skills as well as the 
recognition and recall of specific items of information.

In order to measure concepts, information, and understanding, a
variety of objective test exercises are used. Those types of exercises
most frequently used in recent tests are matching items and multiple
choice items. The matching items are best adapted to the definition
of social studies concepts and terms and the identification of
historical personages, places, and times. The multiple choice items are
better adapted to measure understanding and judgment.

The following test exercise from a Cooperative American History
Test illustrates the testing of terms.


1. Socialism              64. The executive is responsible to the
2. Despotism                  popular branch of the legislative
3. Confederation              body ......................... (   )
4. Parliamentary          65. The [illegible] the union
   government                 retain their sovereignty ..... (   )
                          66. The divine right theory tended to
                              produce this type of
                              government ................... (   )


The following matching test exercise from the Cummings World
History Test illustrates the identification of personages.


19. He established the practice of using chemicals    a. Koch
    to destroy bacterial infections.                  b. Lister
20. He directed the study which discovered the        c. Pasteur
    cause of yellow fever.                            d. Pavlov
21. He discovered and developed a treatment for       e. Reed
    rabies.


The following multiple choice exercise from the Crary American
History Test illustrates the measurement of understanding.


67. American workers organized in labor unions after 1865 to — 


a. seek the overthrow of capitalism. 
b. seek better wages and working conditions. 
c. seek political control of the government. 
d. aid immigrants in becoming better Americans. 

In Table 11, representative social studies tests have been organized
under the headings: general social studies, history, civics, and
geography.

TABLE 11  Illustrative Social Studies Tests

TEST AND PUBLISHER                        DATE      CONTENT                                   GRADE

General Social Studies

California Tests in Social and Related    1954      American heritage, peoples of other       4-8
Sciences, Parts I and II: Elementary                lands, geography, basic social
(California Test Bureau)                            processes

Cooperative Social Studies Test for       1947      Informational background, terms and       7-9
Grades 7, 8 and 9                         et seq.   concepts, comprehension and
(Educational Testing Service)                       interpretation

Cooperative General Achievement Tests:    1947      Social studies concepts, ability to       10-13
Test 1. General Proficiency in the        et seq.   interpret social studies material in
Field of Social Studies                             maps, graphs, and reading selections
(Educational Testing Service)

Stanford Social Studies Test              1953      Informational background: items drawn     5-9
(World Book Company)                                from history, geography, and civics

History

California Tests in Social and Related    1954      Creating the new nation, nationalism,     9-12
Sciences, Parts I and II: Advanced                  sectionalism and conflict, emergence
(California Test Bureau)                            of modern America, the U.S. in
                                                    transition

Cooperative American History Test         1947      Basic facts and trends in economic,       9-12
(Educational Testing Service)             et seq.   social, and political development of
                                                    the U.S.

Cooperative Modern European History       1947      Covers period from middle ages up to      9-12, College
Test                                      et seq.   the present: understanding of
(Educational Testing Service)                       movements and institutions

Cooperative World History Test            1947      Prehistoric times to the present:         9-12
(Educational Testing Service)             et seq.   political, social, and economic trends

Crary American History Test               1950      Historical facts, understanding of        End of course
(World Book Company)                                historical processes, map skills,
                                                    reasoning by inference

Cummings World History Test               1950      Major historical events, dates, places,   End of course
(World Book Company)                                persons, and ideas in ancient,
                                                    medieval, and modern history

Civics

Cooperative Test in American              1947      Conventional material generally taught    End of course
Government                                et seq.   in course in American government
(Educational Testing Service)

Cooperative Test of Community Affairs     1941      Pupil information concerning community    9-12
(Educational Testing Service)                       in which they live: geography,
                                                    government, health, welfare,
                                                    employment

Geography

National Achievement Tests, Municipal     1950      Geographic concepts, products,
Battery: Geography Test                             locations, customs, etc.
(Acorn Publishing Co.)

Modern Geography and Allied Social        1950      Causal geography, trade routes,
Studies                                             products, place geography, etc.
(C.A. Gregory Company)

[Only one grade range, 6-10, is legible for the two geography tests; the
source does not show which test it belongs to.]


A comprehensive review of tests and measures in the social studies
is provided by Wrightstone and Campbell (15) and by Wesley (14).
The older reviews and listing of tests and measures can be brought
up to date by consulting the mental measurements yearbooks by Buros
(4, 5, 6).


Natural Sciences 


For convenience in discussion, tests have been classified to
correspond with courses in sciences as follows: (a) general science, (b)
biology, (c) physics, and (d) chemistry. These courses constitute the 
major types offered in the elementary and especially the secondary 
schools. At the elementary-school level, the measurement of concepts, 
terms, facts, and understandings in science is usually accomplished by 
means of subtests in a battery of general achievement tests. As with 
tests in social studies, the content of science tests should be checked 
against the curriculum content of a local course of study to determine 
how valid science tests will be for a particular course of study. It is 
recognized that the content of elementary science will vary consider- 
ably from community to community, thus making the measurement 
of specific facts, information, and concepts difficult. 




At the secondary-school level, there is a likelihood of more
uniformity for courses, and progress in these courses will be more
validly assessed by a standardized test than is the case in the elementary
schools. Examination of published tests will reveal that the majority 
of standardized tests in biology emphasize the measurement of facts, 
information, and principles, as illustrated by the items of the Co- 
operative Biology Test. In the general science field, the objectives 
measured, as represented in the Cooperative General Science Test, 
include (a) facts, skills, and applications; (b) terms and concepts; 
and (c) comprehension and interpretation of general science ma- 
terials. This test emphasizes broader understandings rather than spe- 
cific facts and skills of general science. In the field of physics, most 
of the tests deal not only with facts, information, and concepts, but 
also with an understanding of facts and their relationships. The more 
modern tests have also stressed the ability to read scientific materials 
and to understand and master the mathematical skills which are a 
part of physics. In chemistry, available tests generally emphasize a 
mastery of factual or informational items and concepts. In this area, 
too, more recent tests are also stressing the ability to determine rela- 
tionships, the ability to solve chemical problems, and the ability to 
perform laboratory procedures. 

The types of test exercises used in achievement tests for the natural
sciences are similar to those used in the social studies. Since
communication by drawings is often more effective than verbal
presentation alone, more pictorial illustrations are used. The
following test exercise from the Read General Science Test is typical
of this type of exercise.


28. The geological formation above constitutes evidence of

6. volcanic action.
7. erosion.
8. folding.
9. sedimentation in a running stream.
10. movement in the earth's crust.


In the natural sciences, evaluation and measurement are discussed
in the Forty-Sixth Yearbook of the National Society for the Study of
Education, Part I (9). Chapter 8 of this yearbook discusses how to
judge the results of instruction in elementary science, and Chapter 15
discusses how to evaluate the outcomes of instruction in science at
the secondary-school level. The series of yearbooks by Buros (4, 5, 6)
may be consulted for information on specific science tests.


TABLE 12. Illustrative Tests in the Natural Sciences

TEST AND PUBLISHER                DATE      CONTENT                               GRADE

General Science

California Tests in Social and    1954      Elementary: health and safety,        4-8,
Related Science, Part III:                  elementary science; Advanced:         9-12
Elementary and Advanced                     physical and biological science
(California Test Bureau)

Calvert Science Information       1935      Animal life, plant life, physical     4-6,
Tests: Elementary and                       changes, earth study, sky study,      7-9
Intermediate                                elementary chemistry
(California Test Bureau)

Cooperative Science Test for      1947      Principles governing functioning      7-9
Grades 7, 8, and 9                et seq.   of common mechanical and
(Educational Testing Service)               electrical devices, natural
                                            phenomena of the physical world

Cooperative General Science       1947      Familiar scientific phenomena         End of
Test (High School)                et seq.   and processes                         course
(Educational Testing Service)

Read General Science Test         1950      Air, water, heat, light, sound,       End of
(World Book Company)                        magnetism, electricity, human         course
                                            body, etc.

Stanford Science Test             1954      Life science, earth science,          5-9
(World Book Company)                        conservation, health, safety,
                                            elementary physics, and chemistry

Biology

Cooperative Biology Test          1947      Fundamental facts and principles      End of
(Educational Testing Service)     et seq.   basic to an understanding of          course
                                            biology

Nelson Biology Test               1950      Knowledge and understanding of        End of
(World Book Company)                        biological facts, terms, and          course
                                            principles

Chemistry

Anderson Chemistry Test           1950      Chemical changes, solutions,          End of
(World Book Company)                        compounds, equations, ionization,     course
                                            scientific method, etc.

Cooperative Chemistry Test        1947      Concepts and terms, reactions,        End of
(Educational Testing Service)     et seq.   properties, and preparations, etc.    course

Physics

Cooperative Physics Test          1947      Mechanics, heat, light, sound,        End of
(Educational Testing Service)     et seq.   electricity, etc.                     course

Dunning Physics Test              1950      Mechanics, heat, light, sound,        End of
(World Book Company)                        electricity, modern physics           course


Music and Art


While it is difficult to measure some of the objectives of music and
art instruction, even by the use of performance tests, progress has been
made recently in methods for appraisal of the more tangible outcomes
in these two fields. In music, tangible outcomes such as the ability to
recognize and comprehend notation, ability to play the notes on a musical
instrument, and the ability to recognize notes and melodies that
are played, can be measured by record playing, checklists, and pencil-and-paper
techniques. More specific information about skills, knowledges,
and understandings in music may be found by reviewing the
tests which are listed in Table 13 which accompanies this discussion.

The measurement of music talent, or aptitude, which has also received
the attention of test constructors, is reported in Chapter Seventeen
dealing with the testing of aptitudes.

At both the elementary- and secondary-school levels, skills and abilities
in graphic arts have been measured to some extent. The three
types of tests now available in this field are (a) drawing scales and
tests, (b) art appreciation tests, and (c) art abilities tests. The drawing
scale is illustrated by the Kline-Carey Drawing Scales. The tests in
art appreciation are represented by the Meier Art Judgment Test.
The art abilities test is illustrated by the Lewerenz Test in Fundamental
Abilities of Visual Arts and the Knauber Art Ability Test. The
various tests that have been devised for the measurement of aptitude
in graphic arts are reported in Chapter Seventeen.


Art appreciation is generally measured in current tests by having 
the pupil choose the better of a pair of specimens. One of each pair 
has been changed in a specific element from the original form. On the 
answer sheet the factor, or element, to be considered is stated for the 
pupil as, for example, arrangement of wall and foreground, inclusion
or omission of the horns on the animal, position of the figure, and di- 
rection of pine tree's main branch. 


In the pair of drawings of the vase, taken from Meier Art Judgment
Test, the examinee is asked to select the better one with particular
attention to the location of the band.

From the Lewerenz Tests in Fundamental Abilities in Visual Art,
the following test exercise illustrates analysis of problems in cylindrical
perspective.


2. Mark with an (X) the edge of the flower pot that is incorrectly
drawn.


The informational aspects of music may be measured by a test such
as the Kwalwasser Test of Music Information and Appreciation from
which the following exercise on classification of orchestral instruments
has been adapted.


Directions: Place the number after each instrument which shows
the classification to which it belongs.

1. string section
2. wood-wind section
3. brass-wind section
4. percussion section

a. Violin ( )
b. Trumpet ( )
c. Xylophone ( )
d. Bells ( )
e. Flute ( )
f. Bassoon ( )
g. Celesta ( )
h. English Horn ( )
i. Lute ( )
j. Ophicleide ( )
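
A matching exercise of this kind is scored by comparing each response with a key. The following is a minimal sketch in Python, not part of the published test. The key is an assumption that follows the conventional orchestral classification (the exercise as reproduced here prints no key), and the function name and the pupil's responses are hypothetical.

```python
# Section numbers follow the exercise: 1 string, 2 wood-wind,
# 3 brass-wind, 4 percussion. This key is an ASSUMPTION based on the
# conventional orchestral classification; the source prints no key.
ANSWER_KEY = {
    "Violin": 1, "Trumpet": 3, "Xylophone": 4, "Bells": 4, "Flute": 2,
    "Bassoon": 2, "Celesta": 4, "English Horn": 2, "Lute": 1, "Ophicleide": 3,
}


def score_matching(responses, key=ANSWER_KEY):
    """Count the instruments a pupil placed in the keyed section.

    Instruments the pupil left blank simply earn no credit.
    """
    return sum(1 for name, section in key.items()
               if responses.get(name) == section)


# Hypothetical pupil responses, invented for illustration only.
pupil = {"Violin": 1, "Trumpet": 3, "Xylophone": 4, "Bells": 3, "Flute": 2,
         "Bassoon": 2, "Celesta": 1, "English Horn": 3, "Lute": 1}
print(score_matching(pupil), "of", len(ANSWER_KEY))
```

A count of correct placements is only one possible scoring rule; a published test may apply a correction for guessing or a different key.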


————— 


It is apparent from the exercises cited from standard tests of music 
and art that knowledge about principles and skills as well as some 
aspects of appreciation can be measured. However, there are limits 
to the objectives that can be measured by such tests. The exercises 
are restricted to recognition of correct or incorrect practices and to 
recognition or recall of information. To measure or evaluate the 
performance or product of an individual requires a more informal 
method of appraisal. This method is normally used by the classroom 
teacher to judge the pupil's progress in abilities and skills in music
and art.


TABLE 13. Illustrative Tests in Music and Art

TEST AND PUBLISHER                DATE      CONTENT                               GRADE

Music

Beach Music Test                  1939      Knowledge of musical symbols,         4-16
(Bureau of Educational                      pitch discrimination, tone
Measurements, Kansas State                  direction, time values, etc.
Teachers College of Emporia)

Diagnostic Tests of Achievement   1950      Time signatures, major and minor      4-12
in Music                                    keys, note and rest values, song
(California Test Bureau)                    recognition, etc.


Drake Musical Memory Test         1934      Pupil indicates whether repetition
(Public School Publishing Co.)              of a given melody is the same or
                                            differs in key, time, or notes

Kwalwasser-Ruch Test of Musical   1927      Knowledge of musical symbols and
Accomplishment                              terms, recognition of pitch names,
(Bureau of Educational Research             detection of errors in time and
and Service, State University               pitch, note values, etc.
of Iowa)

Kwalwasser Test of Musical        1927      Information about notation,
Information and Achievement                 instruments, signatures,
(Bureau of Educational Research             musicians, etc.
and Service, State University
of Iowa)

Musical Achievement Test          1933      Musical notation, use of symbols,
(Bureau of Publications,                    familiar melodies, musical terms,
Teachers College, Columbia                  etc.
Univ.)

Providence Inventory Test in      1932      Naming notes, signatures, rest
Music                                       values, melodies, etc.
(World Book Company)

Graphic Art

Graves Design Judgment Test       1948      Perception of unity, dominance,
(Psychological Corp.)                       balance, symmetry, proportion,
                                            etc.

Kline-Carey Drawing Scales        1933      Series of samples for judging
(Johns Hopkins Press)                       performance in representation,
                                            design, and composition

Knauber Art Ability Test          1935      Drawing design from memory,
(Alma J. Knauber, 3331 Arrow                completing design from given
Ave., Cincinnati, Ohio)                     elements, etc.

McAdory Art Test                  1929      Requires pupil to rank in order
(Bureau of Publications,                    of merit four variations of the
Teachers College, Columbia                  same theme on each of 72 plates
Univ.)

Meier Art Tests: I. Art Judgment            Requires pupil to select better
(Psychological Corp.)                       of 100 pairs of pictures

Tests in Fundamental Abilities              Recognition of proportion,
in Visual Art                               originality of line drawing,
(California Test Bureau)                    light and shade, visual memory,
                                            etc.

GRADE

Age 8 and above
4-12
9-16
4-9
7-16
1-8
7-16
9-16


Tests in music and art continue to be the subject of study and ex- 
perimentation. Lundin (11) has reported briefly on the development 
and validation of a set of musical ability tests. Beverley (1) discusses 
the art teacher and evaluation in a special publication on present-day 
art education. In Buros (4, 5, 6) critical reviews of various tests in 
music and art are available. 


Foreign Languages 


The objectives in the foreign languages have generally stressed the 
ability of the pupil to read, write, speak, and understand a foreign 
language. These objectives have received differential emphasis during 
the past decades. During World War II there was a renewed emphasis 
upon the objectives of speaking, or oral language, but in secondary- 
school instruction the major emphasis remains on reading and under- 
standing a foreign language rather than upon writing and speaking. 

The measurable objectives in the current foreign language tests may 
be classified as follows: (a) vocabulary or meaning of foreign words, 
(b) reading comprehension, (c) grammar, and (d) knowledge of 
cultural history and literature. In general, the tests used to measure
the various objectives have employed multiple-choice format. In
design, foreign language tests, of both modern and ancient languages,
are essentially the same. The usual sequence is a subtest on vocabulary,
a subtest on reading comprehension, and a subtest on grammar.
In a few tests a section on cultural history and literature of the foreign
country is included.

The items which follow are cited from a Cooperative French Test.
They indicate the types of test exercises used to measure abilities and
skills in reading comprehension, vocabulary, and grammar.

Illustrative item to test reading comprehension

J'avais cinq ans il y a vingt ans. Quel âge ai-je maintenant?
1. dix ans 2. quinze ans 3. vingt-cinq ans 4. cinq ans
5. trente-cinq ans

Illustrative item to test vocabulary

pensée 1. côté 2. vue 3. idée 4. effet 5. arrivée

Illustrative item to test grammar

est    on    s'est    sont    se

Spanish is spoken here. Ici ( ) parle espagnol.
Strawberries are eaten with sugar. Les fraises ( ) mangent avec du sucre.




Tests in other foreign languages use essentially the same format of
items to examine abilities and skills in reading comprehension, vocabulary,
and grammatical usage. In Table 14 typical tests for foreign languages
are listed.


TABLE 14. Illustrative Foreign Language Tests


TEST AND PUBLISHER 


French 


American Council Alpha 
French Test 
(World Book Company) 


American Council Beta 
French Test 
(World Book Company) 


Columbia Research 
Bureau French Test 
(World Book Company) 
Cooperative French Test— 
Elementary and 
Advanced 

(Educational Testing 
Service) 


German 


American Council Alpha 
German Test 

(World Book Company) 
Cooperative German 
Test—Elementary and 
Advanced 

(Educational Testing 
Service) 


Italian 


Cooperative Italian Test 
(Educational Testing 
Service) 


DATE 


1927 


1926 


1942 


1927 


1937 


1947 


CONTENT 


Silent reading, recognition, 
vocabulary, English-French 
grammar 


Vocabulary, comprehen- 
sion, grammar 


Vocabulary, comprehen- 
sion, grammar 


Reading, vocabulary, 
grammar 


Vocabulary, grammar, 
reading, composition 


Reading, vocabulary, 
grammar 


Reading, vocabulary, 
grammar 


GRADE 


9-16 


7-11 


One 
year 
of 
study 


TEST AND PUBLISHER                DATE      CONTENT                               GRADE

Latin

Cooperative Latin Test: Lower     1942      Comprehension, grammar,               End of two
and Higher Levels                 et seq.   civilization                          years or more
(Educational Testing Service)                                                     of study

Spanish

Columbia Research Bureau          1927      Vocabulary, comprehension,            9-15
Spanish Test                                grammar
(World Book Company)

Cooperative Spanish Test:         1942      Reading, vocabulary, grammar          End of two
Elementary and Advanced                                                           years or more
(Educational Testing Service)                                                     of study


Cheydleur and Schenck (8) discuss achievement examinations in
foreign languages with particular reference to the credit systems used
in secondary schools and colleges for placement and progress of the
student. Buros (4, 5, 6) provides critical reviews of specific tests in
the foreign languages.


Industrial Arts 


Courses in industrial and vocational education generally include 
such activities as woodworking, carpentry, shopwork, metal work, 
electricity, printing, automobile mechanics, machine shop practice, 
and mechanical drawing. Although there are a variety of objectives to 
be attained in such courses, the tests which have been devised em- 
phasize the measurement of interests and attitudes (discussed in Chap- 
ters Sixteen and Nineteen) and the measurement of knowledge and 
skills in selected industrial arts or trades. Information about industrial 
arts or trades is generally measured by pencil-and-paper tests. These 
include mechanical drawing tests, mechanical comprehension tests,
tests of woodworking, and tests for machinists, machine operators,
electricians, and those engaged in similar industrial activities. Performance
tests have recently been used to observe and rate the abilities
and skills displayed in doing sample jobs involved in an industry
or trade.


INDUSTRIAL OR TRADE KNOWLEDGE TESTS 

Both written and oral industrial and trade knowledge tests are used. 
Written tests are generally used to measure achievement in the ac- 
quisition of knowledge about an industry or trade. The test exer- 
cises measure information about the materials and tools used in the 
shop; the routines, practices, and techniques used in the industry;
and appropriate safety measures to be observed. Oral trade knowledge 
tests are used more extensively by employment agencies and person- 
nel workers to check on the knowledge a job applicant has about a 
specific trade. 

The following test exercises from the Purdue Technical Information
Test for Machinists and Machine Operators illustrate one type of item
used in written tests.


87. If it fits, the wrench which will probably do the
least harm to the corners of a nut is (a) an adjustable,
(b) an alligator, (c) an open end, (d) a socket
wrench ........................................ a b c d

75. To saw pipe, use [...] (a) 12, (b) 16, (c) 20,
(d) 24 teeth per inch ......................... a b c d


The following item is cited from a test in auto mechanics, used in
New York City high schools.


A slipping clutch is usually caused by (a) worn clutch throw-out
bearing, (b) worn pilot bearing, (c) worn clutch facings,
(d) excessive pedal clearance.

Variations of such test exercises are provided by drawings of a machine,
part of a machine, or a tool accompanied by a series of questions
to test the individual's understanding of the industrial principles,
practices, or uses involved.

Limitations of informational tests about technical principles and
processes are precisely that they measure knowledge and understanding
about an industry or trade but may not necessarily measure ability
to perform the skills on the job. There is no doubt that a positive correlation
between knowledge and performance exists, but the magnitude
of this correlation has not been established for various industrial
operations.
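
Where paired scores on an informational test and a performance test are available for the same individuals, the magnitude of the relationship can be estimated with the product-moment correlation coefficient. The following Python sketch illustrates the computation only. The function name and the six pairs of scores are hypothetical and are not drawn from any study cited in this chapter.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Product-moment correlation between two equal-length score lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    dx = [x - mean_x for x in xs]          # deviations from the mean
    dy = [y - mean_y for y in ys]
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    if sum_xx == 0 or sum_yy == 0:
        raise ValueError("undefined when all scores in one list are equal")
    return sum_xy / sqrt(sum_xx * sum_yy)


# Hypothetical scores for six trainees, invented for illustration only.
knowledge_scores = [42, 55, 61, 48, 70, 66]      # written trade test
performance_ratings = [6, 7, 9, 5, 9, 8]         # examiner's checklist total
r = pearson_r(knowledge_scores, performance_ratings)
print(f"r = {r:.2f}")
```

A coefficient obtained from so few cases would be very unstable; the sketch shows the arithmetic, not an adequate study design.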




TABLE 15. Illustrative Industrial Arts Tests

TEST AND PUBLISHER                DATE      CONTENT                               GRADE

Examination in Mechanical         1945      Symbols, methods, and practices       9-12
Drawing                                     of mechanical drawing
(Educational Testing Service)

Purdue Blueprint Reading Test     1949      Symbols, technical vocabulary,        9-12,
(Science Research Associates)               interpretation of drawings            Adult

Purdue Test for Electricians      1942      Care of tools, safety measures,       9-12,
(Science Research Associates)               home installations, splicing          Adult
                                            wires, batteries, etc.

Purdue Test for Machinists and    1942      Machine shop practice, use of         9-12,
Machine Operators                           lathe, shaper, planer, grinder,       Adult
(Science Research Associates)               milling machine, bench work

Test of Mechanical                1951      Simply phrased questions about        9-12,
Comprehension                               drawings reflecting mechanical        Adult
(Psychological Corp.)                       principles


In addition to the pencil-and-paper tests of industrial or trade knowl- 
edge and information, performance tests for automobile mechanics, 
machine and metal trades, electrical trades, cosmetology, and trade 
dressmaking have recently been formulated by the Board of Educa- 
tion of New York City. These tests consist of a series of standardized 
work samples which are performed by the individual and observed 
by a competent examiner. The performance of the individual is observed
and judged as he works, and an evaluation is made by means
of a standardized checklist of the operations. Thus, it is possible to 
obtain not only a measurement of trade knowledge by means of in- 
formation tests, but also measurement of actual performance on 
samples of work by means of direct observation and use of an evalu- 
ative checklist for judging the competence with which an individual 
performs each of several processes involved in the work sample. 

The following job samples were included in the performance test
for auto mechanics. For the more complex jobs, a maximum of thirty
minutes' working time was permitted. For the simpler jobs, a maximum
of fifteen minutes' working time was allowed. The job samples
included:


1. Removing, testing, and replacing the thermostat in an
engine.

2. Making a compression test on the engine and recording
the results.

3. Tuning-up the motor by locating and changing a defective
spark plug and setting the idle speed and idle measure.

4. Measuring with gauge the diameter and length of each
step in the spark plug.

5. Setting toe-in of front wheels to a specified number of
inches.

In order to provide for objective appraisal, specific steps in each 
job sample were defined and credit ratings for observed performance 
were agreed upon. Thus, for removing, testing, and replacing the 
thermostat, the following steps and credits were used. 


Step                                  Credit

Drained water without spilling
Knew where to find thermostat         1
Test of thermostat                    2
Tightened flange bolts evenly         3
Replaced water without spilling       1
No water leaks                        1
Proper selection of tools             1
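
The arithmetic of such a checklist is a simple sum of the credits for the steps the examiner observed. The following Python sketch is an illustration under stated assumptions, not the scoring procedure actually used in New York City. It uses only the six credits legible in the list above; the credit for the first step, "Drained water without spilling," is not legible and is therefore left out rather than guessed. The function name and the observed record are hypothetical.

```python
# Credits as printed above. The credit for "Drained water without
# spilling" is not legible in the source, so that step is omitted here
# rather than assigned a guessed value.
PRINTED_CREDITS = {
    "Knew where to find thermostat": 1,
    "Test of thermostat": 2,
    "Tightened flange bolts evenly": 3,
    "Replaced water without spilling": 1,
    "No water leaks": 1,
    "Proper selection of tools": 1,
}


def score_job_sample(observed, credits):
    """Return (earned, possible) credits for one observed job sample.

    `observed` maps a step name to True if the examiner saw the step
    performed satisfactorily; steps not recorded earn no credit.
    """
    unknown = set(observed) - set(credits)
    if unknown:
        raise KeyError(f"steps not on the checklist: {sorted(unknown)}")
    earned = sum(c for step, c in credits.items() if observed.get(step, False))
    possible = sum(credits.values())
    return earned, possible


# Hypothetical record: every step satisfactory except the flange bolts.
observed = {step: True for step in PRINTED_CREDITS}
observed["Tightened flange bolts evenly"] = False
earned, possible = score_job_sample(observed, PRINTED_CREDITS)
print(f"{earned} of {possible} credits")
```

Keeping the credits in one table and the examiner's observations in another means the same function can score any job sample once its steps and credits have been agreed upon.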


Tests of mechanical aptitude are discussed in Chapter Seventeen.
These aptitude tests involve not only pencil-and-paper techniques,
but also assembling small articles such as a lock or a doorbell.

Two publications offer many valuable suggestions on recent trends
in test construction for the industrial arts. Micheels and Karnes (12)
indicate how educational achievement may be measured by object
tests, manipulative-performance tests, observational methods, and
appraisal of products or projects. Wrightstone and others (16) report
on pencil-and-paper and performance tests devised to measure the effectiveness
of trade education in New York City secondary schools.
As usual, Buros (4, 5, 6) should be consulted for critical reviews of
recent tests for the industrial arts.


Home Economics 


In addition to the tests of industrial arts, similar tests have been 
developed for courses generally known as home economics. These tests 
are similar in format and purpose to the industrial arts tests. In Table
16 some illustrative tests have been listed for such areas as food and 
nutrition, textiles and clothing, sewing, and household management. 

Although the early tests in home economics were mainly tests of 
facts and information, more recent tests have included measurement 
of such objectives as attitudes, appreciations, and performance. Per- 
formance in sewing and food preparation has been evaluated by use 
of checklists or score cards. 

Illustrative test items about food, clothing, and personal care are:

Food. Of the following foods, a moderate serving of the one
that would be used to gain most weight is 1. celery 2.
chocolate cake 3. orange 4. cabbage 5. carrot

Clothing. An oil stain may be removed with 1. water 2.
bleach 3. benzene 4. peroxide

Personal Care. In order to open the pores of the skin, you
should use 1. an ice pack 2. a hot towel 3. application
of cold water 4. oil or ointment


TABLE 16. Illustrative Home Economics Tests

TEST AND PUBLISHER                DATE      CONTENT                               GRADE

Cooperative Test in Foods and     1947      Marketing, menu planning, food        13-16
Nutrition                         et seq.   preparation, food values, etc.
(Educational Testing Service)

Cooperative Test in Household     1950      Selection, use, and care of           13-16
Equipment                                   household appliances and
(Educational Testing Service)               equipment

Minnesota Food Score Cards        1946      Cards for rating 57 products          9-16
(Educational Testing Service)               prepared in laboratory work in
                                            foods

Minnesota Tests of Household      1952      Foods, cleaning, laundering,          9-16
Skills                                      child care
(Science Research Associates)




As in the case of industrial arts, performance tests may be used in
home economics. A performance test in dressmaking included the following
job samples:


1. Preparing a tuck-in blouse for a first fitting, making right
half of garment only.

2. Making a series of seams such as plain seam, French seam,
slot seam, and corded piping.

3. Making tailored buttonholes with self-facing.

4. Making bias binding and roll-hemming.

5. Making machine tucks and bias facing.


Tests and other methods of evaluation in home economics are discussed
by Brown (2) in Chapter 8 of her book. More recent trends
in evaluating home economics are reported by Brown (3), Chadderdon
(7), and Read (13). For critical reviews and listing of home
economics tests, see the yearbooks by Buros (4, 5, 6). Home economics
is a field in which new measures have appeared more abundantly
in the past few years.


Business Education


The courses offered in business education may be classified roughly
into two categories. One of these involves content subjects, such as
Business English, Business Law, Economic Geography, and Business
Training. The other category includes such subjects as bookkeeping
and accounting, shorthand, typing, and general clerical skills. For
informational tests, the test exercises are similar to those used in tests
of English, social studies, and spelling. The following exercises are
illustrative of the items measuring syllabication and abbreviations in
subtests of a stenographic ability test.


Syllabication
Station 1. st-at-ion 2. station 3. sta-ti-on 4. sta-tion

Abbreviations (Write the correct abbreviation for each)

1. Free on board
2. For example
3. Hundred weight

In the following Table 17, representative standardized tests have
been listed under titles commonly found in the high-school curriculum,
namely commercial arithmetic, bookkeeping and accounting,
shorthand, typing, and general clerical work. In the skill tests, the
individual is given a series of actual job samples to perform and the
test is scored in such a way that the individual is rewarded not only
for accuracy in performance of bookkeeping, shorthand, typing, or
clerical work, but also for the speed with which these job samples are
performed.
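
One way in which a single score can reward both speed and accuracy is to reduce the amount of work completed by a penalty for each error and then divide by the working time. The following Python sketch illustrates that idea for a typing job sample. The formula, the ten-word penalty, and the function name are assumptions made for illustration; they are not the scoring rules of any test listed in Table 17.

```python
def net_words_per_minute(words_typed, errors, minutes, penalty_per_error=10):
    """Speed score reduced by an accuracy penalty.

    The 10-word penalty per error is an ASSUMED convention chosen for
    illustration, not a rule taken from a published test manual.
    """
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    # A very inaccurate sample cannot earn a negative score.
    net_words = max(0, words_typed - penalty_per_error * errors)
    return net_words / minutes


# Hypothetical job sample: 400 words typed in 10 minutes with 4 errors.
print(net_words_per_minute(400, 4, 10))
```

Raising the penalty shifts the emphasis of the score toward accuracy; lowering it shifts the emphasis toward speed.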


TABLE 17. Illustrative Business Skills Tests

TEST AND PUBLISHER                DATE      CONTENT

General Clerical

General Clerical Test             1950      Clerical speed and accuracy,
(Psychological Corp.)                       numerical ability, verbal
                                            facility

Minnesota Clerical Test           1946      Checking names and numbers
(Psychological Corp.)

Survey of Working Speed and       1944      Number checking, code
Accuracy                                    translation, finger dexterity,
(California Test Bureau)                    counting

Thurstone Employment Tests:       1922      Computation, spelling, coding,
Examination in Clerical Work                cancellation, classification
(World Book Company)

Commercial Arithmetic

Cooperative Commercial            1944      Computation and problem solving
Arithmetic Test
(Educational Testing Service)

Gilbert Business Arithmetic Test  1941      Computation and problem solving
(Bureau of Educational
Measurements, Kansas State
Teachers College of Emporia)

Bookkeeping and Accounting

Elwell-Fowlkes Bookkeeping Test   1929      General theory, journal,
(World Book Company)                        adjusting entries, ledger,
                                            statements, etc.

Examination in Bookkeeping and    1945      Accounting terms and forms,
Accounting                                  analyzing and recording entries,
(Educational Testing Service)               preparing work sheet, etc.

GRADE

9-16, Adults
9-12, Adults
9-12
9-12
End of course
10-12


TEST AND PUBLISHER                DATE      CONTENT                               GRADE

Stenography

Blackstone Stenographic           1932      Dictation, transcription,             10-12
Proficiency Tests                           business practice, mechanics of
(World Book Company)                        English

SRA Dictation Skills              1947      Accuracy and speed of ability to      10-12,
(Science Research Associates)               take dictation                        Adults

Seashore-Bennett Stenographic     1946      Transcription of letters dictated     Adults
Proficiency Tests                           at various speeds
(Psychological Corp.)

Turse-Durost Shorthand            1942      Language skills, shorthand            10-12
Achievement Test                            penmanship, shorthand principles
(World Book Company)

Typewriting

SRA Typing Skills                 1947      Accuracy and speed of typing          10-12,
(Science Research Associates)                                                     Adults

Thurstone Employment Tests:       1922      Typing from handwritten and           9-12,
Examination in Typing                       edited copy                           Adults
(World Book Company)


Performance tests in business education include job samples for 
shorthand, typewriting, and bookkeeping. In a test constructed and 
used in New York City high schools for shorthand and typewriting, 
three letters were dictated, the first as a “warm-up” letter not to be 
transcribed. The rates for dictation were 64, 80, and 84 words per 
minute. The typing included setting up and typing a bill or invoice 
and the retyping of a revised rough draft of a letter. 

In bookkeeping the job samples included (a) recording on special 
bookkeeping paper a list of business transactions, (b) posting of en- 
tries, and (c) correcting errors which appeared on a trial balance. 

For teachers and supervisors of business education, an entire volume 
by Hardaway and Maier (10) on tests and measurements in the busi- 
ness subjects has been written. The authors provide numerous sug- 
gestions for the construction of teacher-made tests and a comprehensive 
review of standard tests that are available from commercial publishers. 


Summary 


The curriculum in the subjects of social studies, natural sciences,
music, art, foreign languages, industrial arts, home economics, and
business education may vary in objectives and content from locality
to locality. This variation requires that the test user check the content 
of standard achievement tests in any subject against the local curricu- 
lum in order to determine whether or not the test exercises include 
a representative sample of the more specific objectives of information, 
knowledge, and skills which pupils have had an opportunity to learn. 
If the tests show curricular validity, they may appropriately be used 
to measure achievement. 
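
The check described above can be made roughly quantitative by tallying the test items against the topics of the local course of study. The following Python sketch is one hypothetical way of doing so; the function name, the topic labels, and the sample data are invented for illustration, and the judgment of what share of items is acceptable remains with the test user.

```python
def curricular_coverage(item_topics, local_topics):
    """Compare a test's items with the topics taught locally.

    item_topics  : one topic label per test item
    local_topics : topics in the local course of study
    Returns (share of items on taught topics, taught topics with no item).
    """
    if not item_topics:
        raise ValueError("the test has no items to check")
    local = set(local_topics)
    on_taught_topics = [t for t in item_topics if t in local]
    share = len(on_taught_topics) / len(item_topics)
    untested = sorted(local - set(item_topics))
    return share, untested


# Hypothetical general science test and local course of study.
items = ["heat", "light", "sound", "heat"]
course = ["heat", "light", "magnetism"]
share, untested = curricular_coverage(items, course)
print(f"{share:.0%} of items cover taught topics; untested: {untested}")
```

A low share suggests that pupils have not had an opportunity to learn much of what the test measures, and a long list of untested topics suggests that the test samples the local curriculum poorly.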

In the social studies, standard achievement tests have been con- 
structed for general social studies courses, and for the usual courses 
in history, civics, and geography. In the natural sciences, standard 
achievement tests are available for general science, biology, physics,
and chemistry. Achievement in selected objectives of courses in music 
and art may be appraised by standard tests, but test exercises usually 
measure knowledge about principles or theory, and those aspects of 
skills and appreciation which are amenable to pencil-and-paper test- 
ing. Included among the measurable objectives represented in current 
foreign language tests will be found vocabulary, reading comprehen- 
sion, grammar, and knowledge of the cultural history and literature 
of a nation. 

Tests in the industrial arts, home economics, and business educa- 
tion include pencil-and-paper tests and performance tests. In indus- 
trial arts, the pencil-and-paper tests measure trade or technical knowl- 
edge in such courses as mechanical drawing, blueprint reading, elec- 
trical installation, and machine operation. Performance tests, consist- 
ing of a series of standard work samples, have been developed for 
such courses as auto mechanics, machine shop, and electrical shop. 
In a like manner, pencil-and-paper tests in home economics measure 
information and knowledge about food, clothing, sewing, and house- 
hold management. Performance tests measure skills and abilities in 
such courses as dressmaking, cooking, and cosmetology. In business 
education, pencil-and-paper tests are available to measure achieve- 
ment in business English, business law, and business training. Skill 
tests in which the individual is given a series of job samples to perform
with speed and accuracy are available for bookkeeping, shorthand,
typing, and clerical work.

Although typical, or representative, standard tests of achievement 
in each of the selected courses have been cited, it is recommended 
that the test user consult the more comprehensive listing and the detailed
reviews of tests in various subjects provided in the yearbooks
edited by Buros (4, 5, 6).


Problems for Class Discussion


Select a social studies or history test. Check the test items against your
local course of study and report on the content validity.

Make a job analysis of some activity in industrial arts, home economics,
or business education and prepare a tentative rating scale of steps in
performance for one aspect of the job analysis.

Assume that you are planning to administer an end-of-year science test
to a high-school class. Consult the Mental Measurements Yearbooks to
obtain the critiques of test experts on the standard tests in science. Which
test would you use? Why?


References Cited in This Chapter


1. Beverley, F., “The Art Teacher and Evaluation,” Art Education Today,
1949-1950. New York: Teachers College, Columbia University, 1950.
p. 85-91.

2. Brown, Clara M., Evaluation and Investigation in Home Economics.
New York: F. S. Crofts & Co., 1941. Chapter 8.

3. Brown, S. A., Technique for Evaluating the Ability of Teachers to Apply
Principles Concerned with the Developmental Needs of Adolescent
Girls. Doctor's thesis. Ames: Iowa State College, 1949.

4. Buros, O. K., editor, The Nineteen Forty Mental Measurements Year-
book. Highland Park, N. J.: The Mental Measurements Yearbook, 1941.

5. Buros, O. K., editor, The Third Mental Measurements Yearbook. New
Brunswick, N. J.: Rutgers University Press, 1949.

6. Buros, O. K., editor, The Fourth Mental Measurements Yearbook. High-
land Park, N. J.: The Gryphon Press, 1953.

7. Chadderdon, H., and others, Development of Paper-and-Pencil Tests to
Evaluate the Ability to Apply Generalizations in Home Economics. Des
Moines: Iowa State Board for Vocational Education, 1947.

8. Cheydleur, F. F., and Schenck, E. A., Attainment Examinations in For-
eign Languages, Past, Present, and Future: Credits vs. Achievement
at the University of Wisconsin, 1931-1947. Madison: University of
Wisconsin, Bureau of Guidance and Records, 1948.

9. Forty-Sixth Yearbook of the National Society for the Study of Educa-
tion, Part I, “Science Education in American Schools.” Chicago: Uni-
versity of Chicago Press, 1947.

10. Hardaway, M., and Maier, T. B., Tests and Measurements in Business
Education. Cincinnati: South-Western Publishing Company, 1952.

11. Lundin, R. W., “The Development and Validation of a Set of Musical
Ability Tests,” American Psychologist, 2:350, August, 1947.

12. Micheels, W., and Karnes, M., Measuring Educational Achievement.
New York: McGraw-Hill Book Company, 1950.

13. Read, K. H., “A Situation Test Is Tested,” Journal of Home Economics,
40:201-202, April, 1948.

14. Wesley, E. B., Teaching the Social Studies. Boston: D. C. Heath & Co.,
1942. Chapter 6.

15. Wrightstone, J. W., and Campbell, D. S., Social Studies and the Ameri-
can Way of Life. Evanston, Ill.: Row, Peterson and Co., 1942. Chapter
9.

16. Wrightstone, J. W., and others, Measuring the Effectiveness of Instruc-
tion in Vocational Education. Albany: University of the State of New
York, 1951.


References for Further Reading 


Forty-Fifth Yearbook of the National Society for the Study of Education, 
Part I, "The Measurement of Understanding." Chicago: University of 
Chicago Press, 1946. 

Suggestions for teacher-made tests are offered in all subject-matter
fields. Many types of test exercises are presented with special emphasis
on the measurement of understanding.


Greene, H. A., Jorgenson, A. N., and Gerberich, J. R., Measurement and
Evaluation in the Elementary School. Second edition. New York: Long-
mans, Green and Company, 1953.

This volume provides a comprehensive description of tests available
in all subjects of the elementary school. Sample items are cited for many
of the tests to illustrate the test exercises used to measure various
abilities and skills.


Greene, H. A., Jorgenson, A. N., and Gerberich, J. R., Measurement and
Evaluation in the Secondary School. Second edition. New York: Long-
mans, Green and Company, 1954.

As in the case of the volume for elementary-school tests, descriptions of
tests in all subject areas of the secondary school are provided with illus-
trations of representative test exercises.

CHAPTER SIXTEEN. Evaluating Interests


"Interest" is defined in a variety of ways. According
to John Dewey (4), “Genuine interest, in short, simply means that a
person has identified himself with, or has found himself in, a certain
course of action.” Jersild and Tasch (6) emphasize that interest in-
volves a freely chosen activity. According to Douglas Fryer (5), “In-
terests are the objects and activities that stimulate pleasant feeling
in the individual.” Although interest is defined differently, there do
not appear to be any contradictions or inconsistencies represented.
An analysis of these definitions reveals a common concern with per-
sonal feelings, objects, and activity in a situation.


Concept and Importance of Interest


A superficial concept of interest, commonly held by some commer-
cial people and educators, is typified in the practice of "sugar-coating"
activities. These people feel that it is important to catch the interest
and hold it long enough to impart a message, a piece of knowledge,
or a skill. In the classroom, this approach would imply that the teacher
should use an extrinsic kind of appeal to make an educational task
palatable. In such situations, learning is considered a distasteful matter
which must be sugar-coated to make it endurable. This conception,
though condemned by Dewey at the beginning of the twentieth century,
is still in evidence in some present-day classrooms.

Interest must be seen as a correlative to needs. Since a growing
boy needs to develop his physical self, interests like baseball, basket-
ball, hiking, and related activities contribute to such physical de-
velopment. Similarly, the child needs to know how to communicate
with his peers and adults. Consequently, [...]ing activities appropriate
to the child's level [...] to the child's needs. Fundamentally, therefore,
interests are significant insofar as they relate to the needs of students,
for, when a student
needs something, the interest (if intelligently selected) would reflect 
that need. In general, when people need something, they will try to 
satisfy their need through some course of action. 

It can readily be seen, then, that interests, growing out of student 
needs, become the motivating factors which call forth effort. Dewey 
(4) says that "It is not too much to say that a normal person demands 
a certain amount of difficulty to surmount in order that he may have 
a full and vivid sense of what he is about, and hence have a lively 
interest in what he is doing." Seen in this sense, interest and needs are 
significant in education because they are the wellsprings of effort. 
The student needs no external discipline to apply himself when his 
work is interesting and meaningful. Interest is important in the edu- 
cational process because it stimulates effort. In other words, interests 
are a means to an end. 

What is a good interest and what is a poor interest? Are interests 
universally good or bad in themselves, or are interests relative to 
persons, time, place, and culture? While some work has been done 
to discover interests related to such variables as age, sex, and vocation 
(3, 6, 7), little has been attempted in the way of establishing stand-
ards of interest. While the teacher may select certain activities which
he promotes as proper interests, and discourage (or even prohibit) 
other student interests, there are no absolute standards of goodness 
which are available. Generally, either the teacher (or a group of teach- 
ers and the principal) or the pupils and the teacher together must 
establish criteria which serve as a basis for evaluating interests. The 
latter process is highly favored in educational circles, because it teaches
students how to apply standards of their own choosing to common
interests.


Undoubtedly, there are those who would maintain that standards
for evaluating interests should come from some authoritative source
(an absolute religion, judgments of prominent people, traditions, etc.).
However, others would insist that interests should be appraised in
terms of the consequences these interests have on the person or the
social group. Despite this variety of possibilities, for the present, the
appraisal of interests will remain a function of people, and people’s
values will serve as the standards. For instance, some music teachers
may insist that an interest in folk music is inferior to an interest in 
classical music. This problem of what standards should determine the 
goodness of an interest, though unresolved, is related in education 
to the problem of what is meant by "growth" in interests.

"Growth" in education means an improvement in time over a pre-
vious status. If absolute standards were available or feasible, then the
growth of an individual's interests could be gauged in terms of the
extent to which they approached the desired status. If relative stand-
ards were applied, then the interests of an individual in a particular
situation or cultural setting would be good to the extent that they
moved in the direction of desirable goals. Depending upon circum-
stances, one may employ the following
eight abstract considerations in evaluating interests: 

(1) Is the interest permanent or temporary? If permanence is de- 
sirable, then a temporary interest may be evidence of lack of growth. 
If an interest becomes permanent and a temporary state of interest is 
considered desirable (an interest in comic books), then growth would 
depend upon the transient nature of the interest. 

(2) Is the interest deep or is it superficial? Where depth is desirable, 
superficial interests would be evidence of lack of growth. 

(3) Is the interest broad or narrow in scope? At certain ages it
may be felt that interests should be broad so that various avenues of 
self-expression may be discovered. At other times narrow interests 
may be desirable if specialization is culturally desirable. 

(4) Is the interest individual or group-centered? When it is de-
sirable that a self-centered child learn how to play with others, de-
sirable interests may be those of a group nature. Contrariwise, when
it is desirable for a child to learn how to play by himself, individually- 
centered interests may be evidence of growth. 

(5) Is the interest active or passive? For a child who needs to de- 
velop active interests, growth in interests would be the acquisition 
of desirable active interests, while for the child who is very active 
some relatively passive interests may be desirable. 


(6) Are the interests of the child balanced? This criterion includes
some of the other criteria, yet it is different from those specific ones.
Does the child engage in interests which provide for his various
needs?

(7) Are the interests of the child moral or immoral? While anthro-
pologists have shown the relativity of what is moral from one cul-
ture to another, still for any given culture some interests are considered
immoral and others moral. Whatever the standard may be for morality,
the interest can be evaluated in terms of that standard.

(8) Is the interest conducive to a democratic way of life? In a
country where the prevailing ideal is democracy, interests which have
undemocratic characteristics or consequences may be considered
poor, or evidence of lack of growth.

The student is cautioned, when applying these eight considerations
in an appraisal of interests, to think through and to clarify in justifiable
manner what interests evidence lack of growth and what interests 
show growth. Students should be careful not to fall into the trap of 
thinking that whatever exists is good or that present norms define 
adequate standards. Ultimately, what is good must be based on a 
consideration of appropriate ideals. 


The Teacher's Role with Interests 


There are at least six reasons why the teacher should be concerned 
with the interests of pupils. Briefly, these reasons are: 

(1) To Promote Desirable Interests. Each teacher, no matter what 
the age of pupils, has the obligation to promote desirable pupil in- 
terests. Since interests are basic components of living, the teacher 
who neglects them is failing to do a comprehensive job of teaching. 
Since pupils need encouragement in carrying on many desirable long- 
range interests (reading books, practicing on a musical instrument, 
developing athletic skills, etc.) one important phase of teaching should 
include the promotion of desirable interests which already exist. 

(2) To Foster New Interests. Where students have limited or nar- 
row interests, it becomes the job of the teacher to foster new ones. 
Students who come from poor home and neighborhood backgrounds 
can be provided with a wide variety of interests in school, such as 
clubs, sports, journalistic writing, dramatics, and other activities. The 
school as an educational agency serves to introduce new interests to 
boys and girls of varying backgrounds, rural or urban, rich or poor, 
native or foreign-born. 

(3) To Discourage Undesirable Interests. Although the school is 
a place where young people are prepared for living, to some extent 
it presents an idealized environment in which the young are afforded 
an opportunity to grow. Children are in a position to pick up all kinds 
of undesirable interests outside of school where society is far from 
ideal. Under such circumstances the school, through the teacher, 
should try to discourage the child's poor interests
by educationally feasible methods such as the positive substitution of 
more desirable interests. 

(4) To Develop Teacher-Pupil Rapport. The teacher is still con-
sidered by the pupil as somebody who is different from ordinary
human beings and who is unusually interested in certain school or 
academic pursuits. To this extent there may be a gulf between pupil 
and teacher which inhibits free expression by the child. If the teacher
is to develop a cooperative relationship with his pupils, a sincere
knowledge of their interests will aid in establishing a friendly rapport. 

(5) To Enliven the Curriculum. Since subject matter is basically 
a means for learning such fundamental things as personal discipline, 
social adjustment, self-initiative, and good work habits, as well as 
knowledge and skills, the teacher who capitalizes on the interests of 
his children by including those interests as part of the curriculum is 
well advised. The teacher must begin with the learner wherever and 
whatever his status may be and encourage the child to grow in a 
desirable direction. Through the incorporation of pupil interest in 
the daily classroom, the work of the school becomes more real, lively, 
and enjoyable. 

(6) To Provide Educational and Vocational Guidance. The teacher
is frequently considered to be the one who will guide pupils through
the maze of educational curricula and help pupils decide on voca-
tions. While interest alone is not an infallible indication of probable
success in further schooling or in vocations, interest is still a
valuable consideration in conjunction with other data in guidance.
Teachers need to become familiar, through the use of teacher-made
and standardized inventories and questionnaires, with the interests of
their students.


Evaluating Interests in the School 


In attempting to evaluate interests, one faces three broad questions.
First, who should evaluate interests? Second, when should one evalu-
ate interests? Third, what general means can be employed to evaluate
interests?


Who Should Evaluate Interests? In accordance with well-accepted 
objectives in education, the public schools strive to develop, among 
others, two essential abilities in young people: (a) the ability to 
evaluate for oneself, (b) the ability to evaluate cooperatively mat- 
ters of common interest and concern. If these two abilities are to be 
developed, it follows logically that the answer to the question, “Who 
shall evaluate interests?” is twofold. In the first place, the pupil must 
learn gradually how to evaluate his own interests. Second, since the 
child is immature at any given educational level, the teacher, other 
pupils, and the parents may all share in the evaluation of interests. In 
the school situation the pupil and teacher, separately or together, 
should be responsible for the continuous evaluation of interests. 


When Should One Evaluate Interests? The teacher should make 
provision for the evaluation of interests in four instances. First, at the 
beginning of a school year, a survey of student interests is invaluable 
as a means of familiarizing the teacher with the student and for pur- 
poses of describing the initial interest status of the student. Second, 
the teacher may gather evidence on the process which takes place 
when attempts are made to promote new and desirable interests and 
discourage undesirable interests. Such measures serve to illuminate 
any changes in interests or interest patterns which do take place. 
Third, the teacher should survey student interests at the end of the 
year in order to determine growth, or, in other words, desirable changes 
in interests from their initial status. Fourth, on special occasions for 
any given student or group of students, the teacher may desire to
gather more complete information for educational or vocational guid-
ance.
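
The comparison of an initial survey with an end-of-year survey can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `interest_changes` and the sample interests are assumptions made for the example, not part of any published procedure. The sketch lists which interests were acquired, dropped, or retained; whether a given change counts as growth still depends on the standards the teacher and pupils have adopted.

```python
def interest_changes(initial, final):
    """Compare two surveys of one student's interests.

    Both arguments are lists of interest names (hypothetical data).
    Returns the interests acquired, dropped, and retained.
    """
    initial, final = set(initial), set(final)
    return {
        "acquired": sorted(final - initial),   # present only at the end
        "dropped": sorted(initial - final),    # present only at the start
        "retained": sorted(initial & final),   # present in both surveys
    }


# Hypothetical surveys taken at the beginning and end of the school year.
september = ["comic books", "baseball", "movies"]
june = ["baseball", "school newspaper", "hiking", "movies"]

changes = interest_changes(september, june)
for kind, names in changes.items():
    print(kind, names)
```

The output describes change only. Deciding that the loss of "comic books" or the gain of "hiking" is desirable is a matter of appraisal, not of tabulation.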


What General Means Can Be Employed to Evaluate Interests? Since 
evaluation involves an appraisal in terms of standards or criteria based 
upon evidence, there is a need for collecting accurate and objective 
evidence of the presence of interests before these interests can be 
appraised. In general, the teacher has two techniques available for 
collecting evidence on interests. The teacher can utilize appropriate
standardized interest tests, but he must depend to a large extent on 
what he can devise by himself or with interested colleagues. 


Teacher-Made Devices 


There is a variety of general techniques which the teacher may em- 
ploy to become familiar with student interests. Some of these are 
class discussions, student talks, interviews, checklists, questionnaires, 
rating scales, paired-comparison tests, inventories, diaries, anecdotal 
records, and logs. Naturally, some techniques are more appropriate to 
some grade levels and situations than are others. For instance, any 
written questionnaire would obviously be inappropriate for a kinder- 
garten group. In such a case, an oral interview would be one of the 
best techniques for learning about interests. For an elementary-school
group which needs practice in writing, an essay on “Things I Like to 
Do” would be an educationally useful method. 

Although the teacher is generally not in a position to construct 
standardized and statistically valid and reliable measuring tools, it 
does not follow that the teacher may not be capable of developing 
techniques which have educational value. If a technique of gathering
evidence on interests provides the teacher with sufficient data to help
his students grow, it serves its function as an informal evaluation 
device. Many teachers are in a position to construct devices (with or 
without student cooperation) appropriate to their needs and level of 
teaching. Some useful approaches to the evaluation of student in- 
terests are presented below as illustrations. 

At the nursery-school level an interview with the parent is a con- 
venient source of data on the child. Responses by the parent can be 
remembered and written up later, or a simple interview schedule con- 
taining a checklist of interests can be used for immediate recording. 
The interview technique can be supplemented by observational notes 
on the types of interests characterizing each child. In working with 
the child himself, such materials as blocks, construction toys, dolls, 
puzzles, paints, clay, sand, water, and such tools as the hammer or 
screw driver can serve to reveal interests as well as other behavior
resulting from the child's development and needs. Data gathered 
early in the school year serve as “initial” descriptions of the status of 
a child. For “process” descriptions of child growth in interests, the 
nursery-school teacher can repeat interviews with the child and the 
parent. In addition, for increased accuracy and objectivity in conclu- 
sions on child interests, such devices as anecdotal records, logs, and 
checklists can be used to show what kinds of interests the child is
pursuing. Similarly, for "end" descriptions and appraisal of interests,
the teacher may use any one technique or a combination of techniques
already mentioned, preferably the latter, thereby appraising the range
and quality of new desirable interests which the nursery-school child
has acquired during the year. A report to the parents on growth in
interests can be oral, written, or both. A written record should be
kept of the child's interests.

At the elementary-school level the teacher can employ any of the
techniques already enumerated for the nursery school. In addition,
the variety and complexity of the devices are greater since upper-
elementary pupils are able to read and write and have had opportuni-
ties to engage in a large number of activities. One of the most prac-
tical techniques a teacher can employ to construct an interest inven-
tory appropriate to his own group is to have the student write on sep-
arate slips of paper each activity he now enjoys. If the student can't
write, the teacher can ask the student and then record each interest
on separate slips of paper. After collecting all the slips the teacher
can discard repetitions and then group¹ similar interests on an inven-
tory or checklist. A suggestive illustration is the Ohio Interest In-
ventory for the Intermediate Grades.² This inventory is scored for
eighteen areas: Sports, School, Dramatics, Home Activities, Leader-
ship, Science, English, Industrial Arts, Helping Others, Health, Mov-
ies, Doing Things Alone, Social Science, Mathematics, Music, Fine
Arts, Reading, and Radio. The 360 items are arranged in groups of
five similar activities so that the child is not made overly conscious of
item concentration. Also, scoring each area becomes very efficient.

¹ For various types of grouping or breakdowns see discussion on pages 301-302.
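
A rough sketch of why grouping items by fives makes area scoring efficient is given here in Python. The text states only that 360 items fall in eighteen areas and are arranged in groups of five; everything else in the sketch is an assumption made for illustration. In particular, the rule that successive groups of five rotate through the areas in the order listed, and the rule that an area score is the number of liked items, are hypothetical and are not taken from the inventory's manual.

```python
AREAS = [
    "Sports", "School", "Dramatics", "Home Activities", "Leadership",
    "Science", "English", "Industrial Arts", "Helping Others", "Health",
    "Movies", "Doing Things Alone", "Social Science", "Mathematics",
    "Music", "Fine Arts", "Reading", "Radio",
]
ITEMS_PER_GROUP = 5
TOTAL_ITEMS = 360


def area_of(item_number):
    """Return the area of an item numbered 1..360.

    Assumption: each consecutive block of five items belongs to one area,
    and the blocks cycle through AREAS in order (hypothetical layout).
    """
    if not 1 <= item_number <= TOTAL_ITEMS:
        raise ValueError(f"item number out of range: {item_number}")
    group_index = (item_number - 1) // ITEMS_PER_GROUP
    return AREAS[group_index % len(AREAS)]


def score(liked_items):
    """Count the liked items in each area (assumed scoring rule)."""
    scores = {area: 0 for area in AREAS}
    for number in liked_items:
        scores[area_of(number)] += 1
    return scores


# Hypothetical pupil who marked items 1-3, 6, 91, and 92 as liked.
pupil_scores = score([1, 2, 3, 6, 91, 92])
print(pupil_scores["Sports"], pupil_scores["School"])
```

Under this assumed layout each area receives 20 items, that is, four groups of five, which agrees with the totals given in the text.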


Some items in the inventory are reproduced below to show their 
arrangement. 


To play baseball
To take gym
To march in the gym
To do stunts
To play relay games
To watch football games
To watch basketball games
To climb trees
To know sports stars
To read stories about sports
To hear sports broadcasts
To see movies about sports
To go swimming
To jump rope
To go hiking
To go bicycling
To roller skate
To go ice-skating
To play hopscotch
To play games in snow


Because of the absence of any research data on validity, reliability,
and norms, this inventory is classified as a teacher-made device. The 
manual which accompanies the test suggests various ways in which the 
inventory may be educationally profitable. 

On the junior- and senior-high-school levels, where teachers are 
generally specialists in one or two subjects, there is a need for becom- 
ing familiar with likes and dislikes of students for various subjects or 
subdivisions thereof. While the homeroom teacher may be able to 
find time to engage in general interest evaluation, it is the subject-
matter specialist who is desirous of tapping adolescent effort by or-
ganizing units about pupil interests. Also, in attempting to plan a
program for a longer period of time than a month, the subject-matter 
teacher may wish to use a questionnaire or checklist. An illustration 
of a part of a teacher-made "Musical Interest and Skills Question- 
naire" is presented below. 

This device was designed to discover which pupils in a music class
had either passive or active interest in music. In addition, the items
were scored to reveal whether students were interested in singing,
[...] music, or other musical activity. Notice, too, that the
fourth category of response, "would like some opportunity," may be
a fine guide for the teacher who is trying to enlarge student interests.
A questionnaire of this kind can be answered by the student on a
separate answer sheet to facilitate scoring.

² Issued by Ohio Scholarship Tests and Elementary Supervision, State
Department of Education, Columbus, Ohio. Developed by the Euclid
Elementary Teachers in cooperation with the College of Education,
Ohio State University.


MUSICAL SKILLS AND INTERESTS QUESTIONNAIRE 


Directions: The following questionnaire is devoted entirely to the musical 
interests which have been shown by high-school students. We want you 
to indicate whether you like an activity, are indifferent to it, dislike the 
activity, or would like some opportunity to do it. On your separate answer 
sheet beside each number indicate the letter which describes your feeling. 


L means like
I means indifferent
D means dislike
O means would like some opportunity


1. To go to concerts.
2. To listen to the violin.
3. To play the oboe.
4. To sing in a glee club or chorus.
5. To play popular music.
6. To listen to string ensembles.
7. To discuss the relative merits and demerits of various conductors
   of symphony orchestras.
8. To play the flute.
9. To listen to dance music on the radio.
10. To attend movies with a good musical background.
11. To play the violin.
12. To hear operatic arias.
13. To sing when you are alone.
14. To accompany a glee club or chorus.
15. To hear organ music.
16. To attend musical comedies.
17. To sing in a school music festival.
18. To play in a woodwind ensemble.
19. To hear songs in foreign languages.
20. To take music lessons.
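
One way a teacher might tabulate a pupil's answer sheet for the questionnaire above is sketched here in Python. The four response letters come from the directions; the division of the twenty items into "active" and "passive" is an assumption made from the wording of each item (playing, singing, accompanying, and taking lessons are treated as active), and the names `ACTIVE_ITEMS` and `summarize` are hypothetical. The original scoring key is not reproduced in the text.

```python
# Assumed classification: items that ask the pupil to perform are "active";
# all other items (listening, attending, discussing) are "passive".
ACTIVE_ITEMS = {3, 4, 5, 8, 11, 13, 14, 17, 18, 20}
VALID_LETTERS = {"L", "I", "D", "O"}


def summarize(answers):
    """Tabulate one answer sheet.

    answers maps an item number (1-20) to "L", "I", "D", or "O".
    Returns counts of liked active and passive items, plus the items
    for which the pupil would like some opportunity.
    """
    summary = {"active_likes": 0, "passive_likes": 0, "wants_opportunity": []}
    for item, letter in sorted(answers.items()):
        if letter not in VALID_LETTERS:
            raise ValueError(f"item {item}: unknown response {letter!r}")
        if letter == "L":
            key = "active_likes" if item in ACTIVE_ITEMS else "passive_likes"
            summary[key] += 1
        elif letter == "O":
            summary["wants_opportunity"].append(item)
    return summary


# Hypothetical partial answer sheet for one pupil.
answers = {1: "L", 2: "L", 3: "O", 4: "D", 8: "O", 12: "I", 13: "L"}
result = summarize(answers)
print(result)
```

The list of "O" responses is the part most directly useful to a teacher who is trying to enlarge student interests, since it names activities the pupil has not yet had a chance to try.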
In constructing devices for appraising interests, the teacher may de-
sire to become familiar with possible categories or types of items to
include. Of course, the purpose of the instrument and the basic edu-
cational situation would determine the categories. The following pos-
sible classifications of interests are offered. In a sense, one may wish
to refer to these classifications as dimensions for measuring interests.
While there may be many more possibilities, or more important ones,
these ten dimensions should be suggestive.


1. Type of Participation. Active or passive.
2. Nature of Participation. Individual, group, or both.
3. Direction. Personal or social.
4. Content. Vocational or avocational.
5. Place. Outdoor or indoor.
6. School Subjects. Chemistry, English, history, etc.
7. School or Life Areas. Scientific, literary, artistic, etc.
8. Type. Human versus inanimate and non-human objects.
9. Past Experience. Interests were experienced or not.
10. Enjoyment. Like or don't like; or other degrees of appreciation.


In surveying this section on teacher-made instruments or procedures 
for collecting and evaluating interests, it is proper to point out that 
the basic assumption underlying this presentation is that classes vary 
so greatly from school to school and in some cases within schools that 
it is not possible to hand out ready-made devices to teachers. Teach- 
ers may well cooperate with others one or two grades below and above 
them in order to construct with maximum efficiency instruments ap- 
propriate to their children. In discussing teacher-made devices one 
must indicate that their greatest value lies in flexibility of use and 
adaptability to the work of a particular class or school. Teachers, no 
matter what their grade level, can try out techniques on a preliminary
basis and develop appropriate means to become familiar with, and to
appraise, student interests.


Standardized Devices 


There are few valid and reliable standardized interest devices avail- 
able to teachers at various grade levels to help them evaluate the 
growth of the interests of their students. While one may understand the 
absence of such standardized devices on numerous grounds, such as 
the complexity of encompassing the wide range of human interests 
into manageable categories and standardizing them for various fac- 
tors such as age, sex, residence, etc., still one must not neglect the 
fact that interest measurement was not begun until the 1920’s. 

Until very recently, the field of interest measurement has remained 
relatively unexplored. Even at the present time, however, the major
emphasis in interest measurement is on vocational guidance, not class-
room or educational guidance.

The elementary-school classroom teacher will find virtually no
standardized tests available to help diagnose and evaluate child in-
terests. For the high-school teacher the main value to be derived
from standardized interest inventories is from the few useful voca-
tional interest scales. For the convenience of the teacher, Table 18,
summarizing school and vocational interest measures, is presented.


TABLE 18. Illustrative School and Vocational Interest Inventories


TEST AND PUBLISHER              DATE   ASPECTS MEASURED                     GRADE

Brainard Occupational           1945   Commercial, personal service,        9-16,
Preference Inventory                   agricultural, mechanical,            Adults
(Psychological Corp.)                  professional, esthetic, scientific

Cleeton Vocational              1943   Mechanical, scientific, artistic,    9-16,
Interest Inventory                     etc.                                 Adults
(McKnight and McKnight)

Interest Index: General         1950   Sports, science, literature, etc.    7-13
Education Series
(Educational Testing Service)

Kuder Preference                1948   Mechanical, computational,           9-16,
Record—Vocational                      scientific, persuasive, artistic,    Adults
(Science Research Associates)          musical, literary, clerical

Kuder Preference                1948   Working with ideas, being active     9-16,
Record—Personal                        in groups, avoiding conflicts,       Adults
(Science Research Associates)          directing others, being in familiar
                                       and stable situations

Occupational Interest           1944   Personal-social, natural,            7-12,
Inventory—Intermediate                 mechanical, business, the arts,      9-16,
and Advanced                           the sciences                         Adults
(California Test Bureau)

Primary Business                1942   Accounting, collections,             9-16,
Interests Test                         sales-office, sales-store,           Adults
(Science Research Associates)          stenographic filing

Thurstone Interest              1947   Physical science, biological         9-16,
Schedule                               science, computational, business,    Adults
(Psychological Corp.)                  executive, persuasive, linguistic,
                                       humanitarian, artistic, musical

Vocational Interest             1958   Personal-social, natural,            9-16,
Analysis                               mechanical, business, the arts,      Adults
(California Test Bureau)               the sciences

Vocational Interest             1947   Scoring keys for interests similar   12-16,
Blank—Men and Women                    to artist, dentist, farmer, etc.     Adults
(Stanford University Press)            (47 occupational keys for men,
                                       28 for women)

What I Like to Do               1954   Art, music, social studies, active   4-7
(Science Research Associates)          play, quiet play, manual arts,
                                       home arts, science


In general, for the present, the most convenient device that the 
high-school teacher can employ without spending too much time 
and money is the Kuder Preference Record. The Kuder Preference 
Record is a group test, easily administered and scored, which yields 
scores in nine different areas: (1) mechanical, (2) computational,
(3) scientific, (4) persuasive, (5) artistic, (6) literary, (7) musical, 
(8) social service, and (9) clerical. A classification of selected occupa- 
tions under each general area is provided to help the guidance person 
find appropriate levels and kinds of occupations. 

The Kuder Preference Record is designed so that the individual 
indicates the activity he prefers most and the activity he prefers least
out of a group of items. Nine sample items out of a total of
approximately 500 drawn from Form C are as follows:


g. Draw a comic strip
h. Write advertising for electrical appliances
j. Operate a truck farm

k. Experiment with making some candy for which you
   didn't have the recipe
l. Tell stories to children
m. Paint water colors

n. Do chemical research
p. Interview applicants for employment
q. Write feature stories for a newspaper


Experienced counselors would be more inclined to use a combina- 
tion of devices, but for the regular classroom teacher, for whom voca- 
tional guidance is a job to be done in little time at little expense, the 
Kuder Preference Record or similar inventories are appropriate. 

The Maller-Glaser Interest-Values Inventory can be used as a means
of providing guidance for high-school students in matters of
curriculum and adjustment. The inventory is self-administering and takes
less than thirty minutes. It indicates a measure of the relative
dominance of four major types of interest: theoretic, aesthetic, social,
and economic. In addition, data are provided on such personal material
as security, social appraisal, sense of achievement, and sense of
belonging.

Another convenient and inexpensive interest inventory which might 
be used by the classroom teacher is the Cleeton Vocational Interest 
Inventory. There are separate blanks for men and women. In all, 
there are 700 items, 70 for each of ten fields of work. The administra- 
tion and scoring are relatively easy, but, as in most standardized tests, 
the teacher must be careful to interpret the results properly. For more 
complete information about these devices, teachers are advised to 
order specimen sets and read them critically. Then, to check on his 
own judgment, the teacher might examine the reviews in Buros (1, 2). 

It might be said that the paucity of standardized tests for evalu- 
ating other than vocational interests compels the teacher to depend 
upon self-made devices or to neglect this area of testing. However, for 
vocational testing and academic guidance the high-school teacher has 
a few possibilities which, though not adequately validated, provide
important information.


Summary 

Interest is considered to be freely chosen enjoyable activities which
basically relate to needs and drives. It seems to be true that when
pupils are interested they work longer, harder, and more effectively
under self-discipline than otherwise. Although there are no lists of
absolute standards which define a good interest or a bad interest,
daily behavior of teachers and parents reveals the presence of
standards. Since standards appear to be relative to the child and the
situation, evaluation of interests in terms of standards needs to be done
continuously by those involved in the educational situation. Growth
is defined as the acquisition of new interests or the development of
desirable interests and the loss or avoidance of undesirable interests.

The teacher concerned with interests may (1) promote desirable 
interests among students, (2) foster new interests, (3) discourage un- 
desirable interests, (4) use interests to develop increasing rapport be- 
tween himself and the pupils, (5) use interests to motivate pupils in 
the curriculum, and (6) use interests in educational and vocational 
guidance. A continuous record of pupil interests should be kept in
order to promote these ends.




Interests should be evaluated continuously and cooperatively by 
pupils, teachers, and parents. This evaluation should include initial 
measures at the beginning of the term, process measures during the 
term, and final measures at the end of the school year. Also, for spe- 
cial purposes, the interests of any pupil or group of pupils may be 
studied at any time. In the absence of standardized measures the 
teacher may construct his own from such techniques as class discus- 
sion, student talks, interviews, checklists, questionnaires, rating scales, 
paired-comparison tests, inventories, diaries, anecdotal records, and 
logs. Dimensions for analyzing interests are presented. The main 
value of standardized devices for the teacher in the field of interests 
is to provide additional data for vocational guidance. 


Problems for Class Discussion 


1. Administer a standardized interest inventory to several pupils and
interpret the results for each pupil.


2. Analyze the responses of a class to the topic: “What I like about tele- 
vision.” Determine what interests and needs are met by this activity. 


3. Construct an interest inventory for children of the grade level or subject
you teach or are preparing to teach. Administer the inventory to a class 
and determine which interests might be used for class projects or activ- 
ities. Which interests are mainly of an individual nature? 


References Cited in This Chapter 


1. Buros, O. K., editor, The Third Mental Measurements Yearbook. New 
Brunswick, N. J.: Rutgers University Press, 1949. 

2. Buros, O. K., editor, The Fourth Mental Measurements Yearbook. High- 
land Park, N. J.: Gryphon Press, 1953. 

3. Carter, H. D., Vocational Interests and Job Orientation. Applied
Psychology Monographs No. 2, American Association for Applied
Psychology. Stanford, California: Stanford University Press, May, 1944.


4. Dewey, John, Interest and Effort in Education. New York: Houghton 
Mifflin Co., 1918. p. 48, 52. 


5. Fryer, Douglas, The Measurement of Interests. New York: Henry Holt 
and Co., 1931. p. 15, 345. 


6. Jersild, A. T., and Tasch, R. J., Children’s Interests and What They 
Suggest for Education. New York: Teachers College, Columbia Univer- 
sity, 1949. p. 2. 

7. Strong, E. K., Jr., Change of Interests with Age. Stanford, California:
Stanford University Press, 1931. 




References for Further Reading


Jersild, A. T., and Tasch, R. J., Children’s Interests and What They Suggest
for Education. New York: Teachers College, Columbia University, 1949.
The authors interpret and translate in a concrete and practical way the
role that interests of children may play in planning the day-by-day
activities and curriculum of the classroom.

Super, Donald E., Appraising Vocational Fitness. New York: Harper &
Brothers, 1949.
In this volume, Chapters 16, 17, and 18 present an authoritative
statement about the nature of vocational interests and various inventories and
scales devised to measure vocational interests. For a brief survey of this
field, this reference is excellent reading.


CHAPTER SEVENTEEN | Evaluating Aptitudes 


Most persons would like to gaze into the crystal ball 
of the future and by some magic insight gained thereby find the 
occupations and activities that most precisely fit their potential abili- 
ties, attitudes, interests, and skills. Aptitude tests are used by the psy- 
chologist to provide a more reliable and valid prediction. So complex 
is the pattern of psychological functioning required in any situation, 
however, that prediction through aptitude tests must be based on 
piecemeal evidence. 

Frequently, one hears people raise questions about aptitudes. Would
it be wise to give Mary, age eight, piano lessons? Johnny has been
studying violin for two years without much success; should he
continue? Look at this drawing Louis made; do you think he should
become an artist? Janet is very slow in her work in kindergarten; do you
think she will be able to do first-grade reading successfully? These
and similar important decisions have to be made daily by people. A
knowledge of aptitudes may make such decisions more intelligent.
College officers ask: Does the prospective student have aptitude for
engineering or law or medicine? Business and industrial firms want
to know about prospective employees: Do they have mechanical
aptitude? Military services ask similar questions about aptitude for
airplane pilot training, radio technician, and other specialized jobs.


The Nature of Aptitudes 


In the Dictionary of Education (4) aptitude is described as a
“pronounced innate capacity for or ability in a given line of endeavor
such as a particular art, school subject, or vocation.” In the same
volume, capacity is defined as “the potentiality of a person for a given
function as conditioned by the total pattern of causes; partly
hereditary and partly environmental.” Ability is defined as “the actual
power present in an organism to carry to completion any given act or to
make adjustments successfully.” According to H. C. Warren (11) in the
Dictionary of Psychology, aptitude is “A condition or set of
characteristics regarded as symptomatic of an individual's ability to
acquire with training some (usually specified) knowledge, skill, or set
of responses, such as the ability to speak a language, to produce
music . . .”

An analysis of these definitions reveals the use of seemingly con- 
flicting terms like natural or acquired, innate capacity or ability, po- 
tentiality or actual achievement, and hereditary or environmental. 
These definitions and terms necessitate a discussion of four
fundamental considerations in defining the nature of aptitudes. These four
questions are as follows:


1. Are aptitudes innate or acquired?

2. Are aptitudes unitary or pluralistic?

3. Are aptitudes constants or variables?

4. Are aptitudes distributed “normally” or multifariously?


Are Aptitudes Innate or Acquired? There are probably no respon- 
sible educators who would deny that heredity plays a significant role 
in defining the limits of potentiality of an individual. At the beginning 
of the twentieth century most of the authorities in educational psy- 
chology supported the theory that “intelligence” and other aptitudes 
were inherited and that the environment played a very minor role 
in their development. In the past two decades, however, as a result of 
theory and research, educational psychologists have moved in the 
direction of according “nurture” significant credit for developing and 
improving aptitudes. Current research supports the thesis that both 
factors interact and contribute to the development of aptitude. 


Are Aptitudes Unitary or Pluralistic? Historically, educators have
behaved as though an aptitude was unitary, that is, a function of a
single general trait or characteristic. This belief is reflected in the
practice, which is still not uncommon, of using the score on an
intelligence test as an indication of the total academic ability of a person.
In recent decades, the application of factor analysis enabled research
workers to identify the special, independent basic aptitudes and
abilities that may be the component parts of a more comprehensive activity,
task, occupation, or psychological function. Modern research
supports the thesis that aptitudes are pluralistic rather than unitary. In
an early study, for example, Kelley (7) identified the following seven
factors: verbal, numerical, spatial, motor, musical, social, and
mechanical. Thurstone (10), basing his work on more extensive factor
analyses of tests, has thus far identified the following relatively
independent factors: verbal fluency, number, memory, spatial, reasoning,
deduction, and induction.

The extensive factor analysis made by the United States Employment
Service under the direction of Shartle (8) identified eleven basic
aptitudes, most of them similar to the Thurstone analysis, plus
psychomotor factors not included in previous test analyses. The
comprehensive analysis made by the Army Air Forces and reported by
Guilford (5) resulted in the identification of twenty-eight primary
abilities.


Are Aptitudes Constants or Variables? Whether or not one is born 
with an aptitude or whether aptitudes are unitary or pluralistic, an- 
other controversial point involves the question: Are aptitudes constants 
or variables? This argument has appeared more popularly in the con- 
troversy surrounding the stability of the I.Q. The basic question seems 
to be whether one's academic aptitude can be raised as a result of 
education or environmental stimulation. It is interesting to note that 
practically nobody argues the question whether intelligence is subject 
to decline or to change downwards. But is it possible for aptitudes to 
change? So long as one assumes the stability of the I.Q., or any other 
index of an aptitude, one may explain away unusual cases of instability 
by referring to unusual social, personal, or psychological
circumstances, or to the unreliability of the test or the test administration.
While the evidence is conflicting, the trend seems to be in the
direction of assuming that aptitudes are somewhat variable, and are
affected within limits by educational and environmental influences. A
tremendous amount of research is still needed to clarify this issue. 


Are Aptitudes Distributed “Normally” or Multifariously? It is a
commonly accepted idea that “intelligence,” as well as other aptitudes, is
normally distributed. By a “normal” distribution one means that very
few people have extremes of goodness or poorness and that most
people cluster about the average. The normal curve is as follows:

[Figure: the normal curve, with the per cent of cases falling within
one, two, and three standard deviations of the mean]

The per cent figures recorded between the lines under the curve
represent the approximate percentage of people who obtain scores within
one, two, or three standard deviations from the mean. The findings
of various scientists that height, weight, and other physical
measurements approximate a normal curve encouraged psychologists, who
desired to achieve as successful results in the social sciences through
the application of scientific methods, including statistics, to apply the
normal curve to psychological data. Thus, Hull (6), in his classic book
on aptitude testing, indicated that, because the bell-shaped
distribution is so characteristic of all forms of human behavior, it should be
considered at least approximately true in the case of any aptitude
unless there is definite evidence to the contrary. Most present-day
workers in the field will accept this point of view.

The Place of Aptitudes in Education 


There are at least five main concerns that the school should have
with reference to aptitudes. In the first place, the school should be
responsible for identifying the general or special aptitudes of each
pupil. Second, assuming that some aptitudes may be the result more
of nature than of nurture, the school should then encourage the
process of maturation. Third, assuming that some aptitudes may be
more the result of nurture than of nature, the school should develop
and encourage such aptitudes. Fourth, the school should utilize its
knowledge of student aptitudes in educational guidance. Fifth, the
school should avail itself of its knowledge of student aptitudes in
vocational guidance. More specifically, what is involved in each of these
five major concerns of the schools in pupil aptitudes?


Identification of Aptitudes By the time a child enters nursery
school or kindergarten, he has been subjected to at least three or four
years of a life filled with all kinds of experiences. Children in the
public schools come from different types of homes, communities, and
socio-economic classes. Within these homes, communities, and
socio-economic classes, various types of ideals, goals, and practices are
countenanced or discouraged. Out of these various personal and
material influences interacting with the child, a wide variety of talents
and abilities arise. (“Talent” is used to signify capacity when its
presence is not accounted for by environmental influences or education.
“Ability” refers to capacities partly accounted for by environmental or
educational influences.) Sometimes these aptitudes seem to be
products of nature, at other times consequences of chance or deliberate
education. Whatever the circumstances, the child possesses at various
stages in his development a number of aptitudes. The school needs to
identify these capacities in order to provide the child with the
education best suited to his needs and the social welfare.

What capacities should be identified? In general, depending upon 
school means and curriculum, some attention should be paid to in- 
tellectual, mechanical, artistic, social, athletic, and occupational apti- 
tudes. Each of these should be evaluated during those times when 
most pertinent. For instance, vocational aptitude tests, inappropriate 
at the kindergarten level, would be useful at the junior- or senior-high- 
school level. On the other hand, mental capacity to learn to read and 
write should be measured as early as feasible after school entrance 
in order to guide the teacher in how to teach each child and in de- 
ciding what to expect from each child. The identification of aptitudes 
is a necessary prerequisite to education. 


Develop and Encourage Pupil Abilities The public school has a
responsibility to develop and encourage certain pupil aptitudes. Basic
to the elementary-school program are writing, reading, arithmetic, and
social adjustment skills and [...]ry classrooms are music, paint- [...]
[...]ative expression in language arts.

Provide Educational Guidance Teachers and guidance specialists
in the school have a continuous obligation to guide students in their
educational progress. Such guidance may involve selecting students
properly for individual or [...] sured position for the
decisions they make, or help students and parents make, for purposes
of promotion or classification. Readiness tests, “intelligence” tests, and
survey achievement tests are the most frequently used instruments for
helping teachers provide such guidance on the elementary level. On
the junior- and senior-high-school levels, achievement, “intelligence,”
and special vocational aptitude tests are used to advise students on
whether to follow academic, commercial, or other types of education
programs.


Provide Vocational Guidance An increasing use of aptitude tests at 
the junior- and senior-high-school level is being made in order to pro- 
vide vocational guidance. Youth who are ready to leave school to go 
to work at ages 16 to 18 are given various tests to help identify apti- 
tudes. On the basis of these test scores and other relevant data, 
students are advised or are given a choice of vocational avenues. For 
those students going on to college, various college aptitude tests are 
given to advise students on the type of college curriculum which may
be best suited for them.


The Evaluation of Aptitudes 


The question of how aptitudes should be evaluated involves several
considerations: (a) Who should evaluate aptitudes in the schools?
(b) When should aptitudes be evaluated? (c) How or by what means
should aptitudes be evaluated?


Who Should Evaluate Aptitudes? In school practice, there are
various persons or groups who are responsible for administering, scoring,
and interpreting aptitude tests. The psychometrist, school psychologist,
school counselor, school supervisor, teacher, or public and private
testing agencies are all persons and groups who either share in the
process or carry on exclusively the aptitude testing in a school system.
In general, it may be said that individual tests (that is, tests
administered to one individual at a time, like the Stanford-Binet, etc.) are
usually administered by a trained psychologist or psychometrist. In
larger city school systems a staff of psychologists is usually responsible
for individual testing of aptitudes and sometimes for group testing in
classrooms. In smaller school systems there may be a visiting school
psychologist who handles individual aptitude testing.

In large cities there may be various agencies, public or private,
which offer services free or for a small fee to those seeking to take
aptitude tests. Many universities have clinics attached to their psychology
departments which offer parents the opportunity to have their children
tested. Also, there may be various social agencies such as a Catholic
Guidance Bureau or a Jewish Vocational Guidance Bureau which offer
aptitude testing services. In most large cities, there are private
vocational counseling and placement firms which provide a battery of
tests (including aptitude) to fee-paying clients. One must learn
whether bureaus offering such services have staffs adequately trained
to administer and interpret aptitude tests before recommending such
agencies to parents or pupils.

In a school system, the teacher as guide should have a major respon- 
sibility in interpreting individually administered aptitude test scores. 
In addition, the teacher should be in a position to administer and in- 
terpret any appropriate group tests which his class may need. In this 
way, on the basis of aptitude test scores, achievement test data, class- 
room observation, case and cumulative data, and other records on the 
child, including a knowledge of home and family background, the
teacher can help the child to progress in school and to make wise
educational and vocational choices.


When Should Aptitudes Be Evaluated? Since one of the main
purposes of administering an aptitude test is to find out how well a child
can be expected to perform in any given experience or subject, such
tests should be given at any time a teacher needs evidence concerning
aptitudes. Also, when a pupil is under special study, aptitude tests
should be administered so long as the pupil's condition permits their
use.

Beginning with the lower grades, there are various instances when
it is necessary to make decisions on pupil selection and grouping.
Children who are slightly under age chronologically are sometimes
permitted to take an individual intelligence test for entrance into
kindergarten or first grade. After school entrance, group tests are
generally given to children at some time during the first grade.
Readiness tests in reading and arithmetic are generally administered during
the first grade, or thereafter as appropriate to the type of children
being taught.

In large school systems, a group [...] repeated about every two or
three [...]


Two other types of aptitude testing take place at the junior- and 
senior-high-school levels. From the seventh grade to the twelfth, apti- 
tude tests in such special subjects as algebra, stenography, geometry, 
or languages, may be given for purposes of educational guidance. 
Similarly, beginning with the seventh grade, vocational aptitude tests 
may be administered individually or to groups of children with certain 
interests and abilities. 

Sometimes, the question is raised “When is the best time during the 
year to give aptitude tests?" There are no necessary advantages in any 
one period. Some administrators and teachers prefer to administer 
aptitude tests at the beginning of a term. Early testing provides the 
teacher with a means for the diagnosis of the child, with an intelligent 
basis for setting goals for the child, and with aid in guiding the child. 
Also, teachers do not become sensitive if their students score low on a 
test, since they do not feel responsible for the achievement of a new 
group. However, similar advantages are derived if a test is given at the 
end of a year, because the new teacher will readily have the scores 
and can make his plans accordingly. In this case, unfortunately, the 
teacher who has had the group all year may, if an insecure person, try 
to teach the children to make high scores on the aptitude test. 


How Should Aptitudes Be Evaluated? Evaluating aptitude is so 
technical a task that educators generally rely on standardized tests for 
evidence. While it is possible for teachers to construct tests or work 
sample exercises which may be useful in educational diagnosis and
prognosis, most teachers are either insufficiently trained or lack the
time to employ their own techniques. Consequently, the general
recommendation is that teachers should, in consultation with
colleagues and reference works like Buros's Mental Measurements
Yearbooks (1, 2, 3), select tests appropriate to their purposes and students.

Aptitude tests are a major method for the diagnosis and evaluation 
of aptitudes, but they are only one way of obtaining the comprehensive 
information that is needed for guiding an individual. Other methods, 
such as the interview, questionnaire, rating scale, anecdotal record,
essay, autobiography, and cumulative record provide essential data. 
Super (9) discusses these methods and their role in appraising voca- 
tional fitness. 

Most aptitude tests are limited to the measurement of the intellectual
and manipulative abilities and skills which correspond with
successful achievement in a study or work situation. But successful
achievement in a total situation involves attitudes, interests, motivation, and
personal-social adaptability as these interact in situations with
intellectual and manipulative abilities and skills.

In elementary and high schools, intelligence tests can predict scho- 
lastic achievement with a fair degree of success. In colleges, scholastic 
aptitude tests can measure potential achievement in college courses. 
However, the prediction based on these tests does not assess attitudes, 
interests, motivation, and personal-social adaptability. As a result, 
Joe Smith who rates above the 90th percentile in scholastic aptitude 
may be failing in his studies because of poor attitudes and motivation, 
lack of interest, or emotionally disturbing problems of his personal or 
social life. 

Fortunately, a high proportion of individuals have the appropriate
attitudes and adjustments which permit them to function
approximately at the level predicted by the aptitude tests. Thus, these tests
are used as an important part of the evidence in determining and
subsequently guiding the growth of the individual.


Types of Aptitude Tests 
INTELLIGENCE TESTS 


In order to evaluate academic aptitude, a variety of intelligence, or
mental ability, tests have been constructed. The broadest classification
of these tests of mental ability is according to the method of
administration, namely, individual tests or group tests. In addition to these
categories, there are other distinctions such as performance tests and
verbal tests, language and non-language tests. Space does not permit
a detailed description of all these types or of all the differences which
exist.

The individual test of mental ability is used principally in clinical 
situations by an examiner who has extensive training in the administra- 
tion and interpretation of such tests. The most widely used of these 
individual tests are the Stanford-Binet Test of Intelligence and the 
Wechsler-Bellevue Intelligence Scale. The Wechsler Scale has one 
form adapted for children and is used generally for children from 
the ages of 4 to 13. The original Wechsler-Bellevue Intelligence Scale, 
Form I, has been revised and is now called the Wechsler Adult Intel- 
ligence Scale. The Binet test and the Wechsler tests use both verbal 
and performance exercises in order to estimate the intelligence of the 
individual. The tests must be administered orally in large part and 
the responses of the examinee recorded by the examiner. The
advantage of the individual test of intelligence is that it permits
interpretation of the responses in clinical terms and provides the
opportunity for observing more carefully the behavior, interest,
adjustment, motivation, and working habits of the individual, in addition
to and in relation with his general intelligence.

The group tests of intelligence, on the other hand, are administered 
to an entire class or group of pupils or individuals, and prepared
responses to the questions or items are checked by the examinee. This
permits objective scoring of the group tests of intelligence. The group 
test may be illustrated by such measures as are listed in Table 19. 


TABLE 19
Illustrative Intelligence Tests

TEST                                                      PUBLISHER                    DATE         LEVEL
ACE Psychological Examination for High School Students    Educational Testing Service  1953         Grades 9-12
Army General Classification Test, First Civilian Edition  Science Research Associates  1947         Grades 9-16, Adults
California Test of Mental Maturity                        California Test Bureau       1951         Grades Kg-1, 1-3, 4-8, 7-10, 9-16
Chicago Tests of Primary Mental Abilities                 Science Research Associates  1947         Ages 11-17
Cooperative School and College Ability Tests              Educational Testing Service  1955         Grades 7-14
Davis-Eells Games                                         World Book Company           1952         Grades 1-2, 3-6
Henmon-Nelson Test of Mental Ability                      Houghton Mifflin Company     [illegible]  Grades [illegible]
Kuhlmann-Anderson Intelligence Tests                      Personnel Press              1952         Grades 7-8
Lorge-Thorndike Intelligence Tests                        Houghton Mifflin Company     1954         Grades [illegible], 2-3, 4-6, 7-9, 10-12
Ohio State University Psychological Test                  Science Research Associates  1950         Grades 9-16
Otis Quick-Scoring Mental Ability Tests                   World Book Company           1954         Grades [illegible], 9-16
Pintner General Ability Tests: Verbal Series              World Book Company           1946         Grades [illegible], 4-9, 9-12
SRA Primary Mental Abilities                              Science Research Associates  1948         Ages 5-7, 7-11, 11-17
SRA Non-Verbal Form                                       Science Research Associates  1947         Grades 9-12
Terman-McNemar Test of Mental Ability                     World Book Company           1942         Grades 7-12


In addition to such verbal tests, there are performance tests of
mental ability. These tests are generally used to measure pupils who are
unable to read or write English, such as pre-school children, children
with speech defects, deaf children, and others whose language is
limited. These tests may be illustrated by the Arthur Point Scale,
Goodenough Draw-A-Man Scale, Pintner-Patterson Scale of
Performance Tests, and the subtests of the Wechsler-Bellevue Scale dealing
with performance. These tests have some disadvantages or limitations
compared with the more verbal types of test. Most performance tests
do not have as high reliability as verbal tests, and the aspects of
intelligence measured by performance tests are not completely comparable
with those measured by verbal tests. Correlations of the order of .55 to
.65 are generally found between performance and the verbal
intelligence tests.
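
One common way to read correlations of this size, offered here as an
added illustration and not as the authors' own analysis, is to square
the coefficient. The square gives the proportion of variance that two
measures share. The function name shared_variance in the sketch below
is hypothetical.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance two measures share, given their
    correlation coefficient r (hypothetical helper)."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie between -1 and 1")
    return r * r


if __name__ == "__main__":
    # The two coefficients quoted in the text for performance and verbal tests.
    for r in (0.55, 0.65):
        pct = 100 * shared_variance(r)
        print(f"r = {r:.2f}: about {pct:.0f} per cent of the variance is shared")
```

On this reading, coefficients of .55 and .65 correspond to roughly 30
and 42 per cent of shared variance, which is consistent with the
statement that the two kinds of test are not completely comparable.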


ACADEMIC APTITUDE TEST ITEMS 


Most of the “intelligence” tests contain very similar types of items.
Aside from the basic differences associated with verbal and non-verbal
and performance and non-performance tests, the contents of
intelligence tests commonly include several of the following types of test
items: vocabulary, general information, verbal analogies, figure
analogies, same-opposites, completion, arithmetic reasoning, arithmetic
computation, number series, identification of missing parts, spatial
relationships, proverbs, verbal and pictorial absurdities, mazes, story and
picture sequences, mixed words or sentences, and reading
comprehension. Illustrations of some of these types of items are shown in
the following excerpts from a few well-known intelligence tests.


(1) Vocabulary ¹
What people say about a person constitutes his (?)
1. character, 2. gossip, 3. reputation, 4. disposition,
5. personality ( )


(2) Proverb ¹
Which one of the six statements below tells the meaning of
the following proverb? “The early bird catches the worm.” ( )
1. Don't do the impossible.
2. Weeping is bad for the eyes.
3. Don't worry over troubles before they come.
4. Early birds like worms best.
5. Prompt persons often secure advantages over tardy ones.
6. It is foolish to fret about things we can't help.


(3) Arithmetic Reasoning ¹
Which number in this series appears a second time nearest
the beginning?
6 4 5 8 7 8 0 9 5 9 8 8 6 5 4 7 3 0 8 9 1


(4) Reading Comprehension ²
The constitution would be the chief stumbling block in the
way of fascists, but since, as students of the subject have
shown, there can be no fascist triumphs without extreme
illegalities, even to some violence, this would be the least
of the fascist worries. Ignoring the constitution, they would
encourage those drives which now are sending power upward
to the federal government. Contrary to the desires of
partisans of laissez-faire, fascism would necessarily result in
speeding up the growth of bureaucracy. With the constitutional
barriers down, the regulative powers of the government
would be extended to an unheard-of extent.

What adjective meaning “directive” is employed in the
paragraph?
1. extreme 2. upward 3. extended 4. regulative
5. encourage

What synonym of “advocators” occurs in the paragraph?
1. partisans 2. students 3. fascists 4. bureaucracy
5. illegalities

What phrase of exactly two words in the paragraph states
that the difficulties in the possible adoption of fascism in the
U. S. have already been published?
1. are sending 2. the constitution 3. have shown
4. extreme illegalities 5. laissez-faire

What would be the least of the fascist worries?
1. illegalities 2. constitution 3. extreme illegalities
4. violence 5. laws

What one word in the paragraph suggests that fascism
would not be expected to go as far as a conquering invader
to attain its ends?
1. constitution 2. unheard-of 3. bureaucracy 4. some
5. violence


(5) Scrambled Words *


(6) Spatial Relations ³
Select the two blocks which, when put together, will look
the same as the first block on the line.

[Figure: sample blocks]


(7) Arithmetic ⁴
How many pencils can you buy for 50 cents at the rate of
2 for 5 cents?
(a) 10 (b) 20 (c) 25 (d) 100 (e) 125


¹ Otis Self-Administering Tests of Mental Ability, Higher Examination: Form A.
World Book Company, Yonkers-on-Hudson, N. Y., 1922.
² Ohio State Psychological Test, Form 23. August, 1947, Ohio College Assoc.,
Columbus, Ohio.
³ Kuhlmann-Anderson Test D, Sixth Edition. Personnel Press, Princeton, N. J.
⁴ American Council on Education, Psychological Examination for College
Freshmen, 1948 Edition. Educational Testing Service.


(8) Figure Analogies ⁴
Look at the Figures A, B, and C in Sample 1 below. Figure A
is a large circle. Figure B is a small circle. By what rule is
Figure A changed to Figure B? The rule is "making it
smaller." Now look at Figure C. It is a large square. What
will it be if you change it by the same rule? A is to B as C
is to:

[Figure: Sample 1, showing Figures A, B, and C and answer choices 1 to 5]


(9) Same-Opposite ⁴
In each of the following lines select the word that means
the same as or the opposite of the word at the left.

deep      (1) blue    (2) shallow  (3) tense    (4) watery
awkward   (1) clumsy  (2) loyal    (3) passive  (4) young
hot       (1) dry     (2) cooked   (3) red      (4) cold


(10) Number Series ⁴
The numbers in each series below follow some rule. For
each series you are to find the next number.

Series                  Next Number
2  4  6  8  10  12      10   11   12   13   14
                        (a)  (b)  (c)  (d)  (e)


(11) Verbal Analogies ⁴
In each row of words, the first two words form a pair. The
third word can be combined with another word to form a
similar pair. Select the word which completes the second
pair.

sky-blue    grass-   (1) ground  (2) sod   (3) path  (4) blue    (5) green
ice-solid   water-   (1) hard    (2) fire  (3) iron  (4) liquid  (5) boat

⁴ Ibid.


(12) Missing Parts ⁵
The examinee is to complete the picture by indicating the
missing parts.

Copyright by Psychological Corp., used by permission.


(13) Picture Sequence ⁵
The examinee is to arrange the pictures in a sequence so that they
tell a sensible story.

Copyright by Psychological Corp., used by permission.


(14) Maze ⁶
[Figure: maze]

Courtesy, C. H. Stoelting Co., Chicago.

The examinee is to begin at “S” and draw a path to show how he
gets out of the maze.


(15) Picture Absurdities ⁷
The examinee is to point out an absurd fact shown in the picture.


⁵ Wechsler Adult Intelligence Scale, Psychological Corp., N. Y.
⁶ Porteus Test, Vineland Revision, Year VI. C. H. Stoelting Co., Chicago.


Teachers who are responsible for selecting standardized 
tests are advised to check the following references for such 
information as test titles and test criticisms: 


a) Ruch, G. M., and Segel, D., Minimum Essentials of the 
Individual Inventory in Guidance. United States Dept. 
of the Interior, Office of Education. Vocational Division 
Bulletin No. 202. (For sale by Superintendent of Docu- 
ments, Washington, D. C., Price 15 cents.) 

b) Froehlich, C. P., and Benson, A. L., Guidance Testing. 
Science Research Associates, Chicago, 1948. 

c) Buros, O. K., editor, The Fourth Mental Measurements 
Yearbook. Gryphon Press, Highland Park, N. J., 1953. 


PROGNOSTIC TESTS 

In addition to intelligence tests, prognostic tests in subject-matter 
areas may be included in the program of the high school. Some school 
personnel administer aptitude tests in algebra, geometry, foreign lan- 
guages, and business subjects in order to predict how well a student 
may perform. Students with low test scores are encouraged to take 
special substitute courses because it is believed that such students will 
fail in regular courses. The validity of such prognostic tests is generally 
low so that it is usually inadvisable to counsel a student to pursue re- 
stricted educational goals unless other significant information about 
the student is taken into consideration. In Table 20, a list of some of
the aptitude tests in the fields of mathematics, language, science, and
business subjects is presented.

⁷ Stanford-Binet Intelligence Scale, Form M, Revised, Houghton Mifflin Co.


TABLE 20
Illustrative Subject-Matter Aptitude Tests

TEST                                PUBLISHER
Algebra Prognosis Test              C. A. Gregory Company
Algebra Readiness Test              Public School Publishing Company
Bennett Stenographic                Psychological Corp.
  Aptitude Test
California Algebra Aptitude Test    Educational Test Bureau
Foreign Language Prognosis Test     Bureau of Publications, Teachers
                                      College, Columbia Univ.
Iowa Algebra Aptitude Test          Bureau of Educational Research and
                                      Service, University of Iowa
Iowa Plane Geometry                 Bureau of Educational Research and
  Aptitude Test                       Service, University of Iowa
Iowa Placement Examinations:        Bureau of Educational Research and
  Chemistry Aptitude                  Service, University of Iowa
Iowa Placement Examinations:        Bureau of Educational Research and
  Physics Aptitude                    Service, University of Iowa
Luria-Orleans Modern Language       World Book Company
  Prognosis Test
Orleans Algebra Prognosis Test      World Book Company
Orleans Geometry Prognosis Test     World Book Company
Turse Shorthand Aptitude Test       World Book Company

[DATE and LEVEL columns: only three entries survive, not matched to
rows: 1942, 9-12; 1944, 12-13; 1944, 12-13]


CLERICAL APTITUDE TESTS 


Clerical aptitude tests measure how much capacity a child or adult 
has for clerical details. The abilities involved are useful to clerks, 
typists, bookkeepers—in fact to anyone who needs accuracy and speed 
in doing paper work. The simplest or most streamlined versions of 
these tests present pairs of numbers and names separated by a short 
line. A check mark is entered on the line if the pairs are the same. No 
mark is entered if the pairs are different. Sample exercises are as fol- 
lows: 


984512  ___  34512          James Jones       ___  James Jones
47306   ___  47306          Sue Palmer        ___  Sue Palmer
89507   ___  89705          Rochester, N. Y.  ___  Rochester, N. Y.
1374928 ___  137498         Kenosha, Wisc.    ___  Kenosha, Wis.

More elaborate versions of these tests include additional sections on
alphabetical filing, classifying individuals according to residence, age,
occupation, etc., simple computations, spelling, and arithmetic
problems.
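The checking exercise described above can be sketched in a few lines of code. This is only an illustration under stated assumptions: the function name is hypothetical, the examinee's marks are invented, and the sketch merely counts correct and incorrect decisions. Published tests are timed and follow the scoring rules in their own manuals, which are not reproduced here.

```python
def score_checking_test(items, marks):
    """Count correct and incorrect decisions on a name-and-number checking test.

    items: list of (left, right) pairs as printed on the test.
    marks: list of booleans, True where the examinee entered a check mark
           (that is, judged the pair to be the same).
    Returns (correct, errors).
    """
    if len(items) != len(marks):
        raise ValueError("one mark is needed for every item")
    correct = 0
    errors = 0
    for (left, right), marked_same in zip(items, marks):
        truly_same = left == right  # exact, character-by-character comparison
        if marked_same == truly_same:
            correct += 1
        else:
            errors += 1
    return correct, errors


# Pairs taken from the sample exercises in the text
ITEMS = [
    ("47306", "47306"),
    ("89507", "89705"),
    ("1374928", "137498"),
    ("James Jones", "James Jones"),
    ("Sue Palmer", "Sue Palmer"),
    ("Rochester, N. Y.", "Rochester, N. Y."),
    ("Kenosha, Wisc.", "Kenosha, Wis."),
]

# Hypothetical examinee who misses two of the altered pairs
marks = [True, True, False, True, True, True, True]
print(score_checking_test(ITEMS, marks))
```

The exact comparison is the point of the exercise: "Kenosha, Wisc." and "Kenosha, Wis." differ by a single letter and must be left unmarked.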

Clerical aptitude tests have proved of value in differentiating be- 
tween good and poor clerks, typists, and stenographers, when they 
are used in conjunction with evidence obtained from other tests, from 
the past record of the individual, and from an interview. Individuals 
who obtain percentile scores of 75 or above (rate in the upper 25 per 
cent of the thousands who have taken the test) are good risks for the 
clerical aspects of an occupation. 
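The percentile standing mentioned above can be illustrated with a short sketch. The norm group, the function names, and the convention of counting tied scores as half are all assumptions made for this example; a published test supplies its own norm tables, which should be used in practice.

```python
def percentile_rank(score, norm_scores):
    """Percentage of the norm group scoring below `score`, counting ties as half."""
    if not norm_scores:
        raise ValueError("norm group is empty")
    below = sum(1 for s in norm_scores if s < score)
    equal = sum(1 for s in norm_scores if s == score)
    return 100.0 * (below + 0.5 * equal) / len(norm_scores)


def is_good_clerical_risk(score, norm_scores, cutoff=75.0):
    """True if the score stands at or above the 75th percentile of the norm group."""
    return percentile_rank(score, norm_scores) >= cutoff


# Hypothetical norm group: raw scores 1 through 100, one person at each score
norms = list(range(1, 101))

print(percentile_rank(76, norms))        # upper quarter of the norm group
print(is_good_clerical_risk(76, norms))
print(is_good_clerical_risk(50, norms))
```

As the text notes, such a standing is one piece of evidence, to be weighed with other tests, the past record, and an interview.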


MANUAL DEXTERITY TESTS 


Manual dexterity tests have been devised to measure the ability to 
work rapidly and skillfully with the fingers, hands, and arms. All tests 
involve some degree of eye-hand coordination. The Minnesota Rate of 
Manipulation Test, for example, requires the individual to place sixty 
cylindrical blocks in sixty regularly arranged holes in a board. One
practice trial is given, then four trials are timed. A second part of
the test requires the individual to pick up each block from its hole,
turn it over, and replace it with the other hand. The hand functions
are reversed after each row of fifteen blocks. This test is useful for
predicting success in such routine manipulative work as packing or
wrapping boxes or cartons.

The O’Connor Finger Dexterity Test requires the individual to
pick up three pins each time from a tray and insert them in small
holes in a metal plate. The scores on this test aid in predicting success
in such occupations as assembling clocks or radio fixtures, which
involve rapid handling of small objects. The reverse side of the metal
plate is used for the tweezer dexterity test. The individual picks up
one pin at a time with the tweezers and inserts it in a small hole. The
score on this test predicts success in such occupations as drafting or
watch repairing, which involve steadiness of hand and eye-hand
coordination.


MECHANICAL APTITUDE TESTS 


Several types of mechanical aptitude tests have been constructed 
including mechanical assembly and pencil-and-paper tests. The me- 
chanical assembly tests attempt to predict success by judging the 
competence of an individual to assemble from disarranged, separate 
parts such simple mechanical objects as a bicycle bell, metal pencil, 
and lock. Other performance tests involve spatial relationships, such 
as the Wiggly Block and cubes. 

The Wiggly Block is an oblong of wood cut into sections with a
jigsaw. Its parts are disarranged, and the individual’s speed at putting
it together is used as a measure of his ability to discern spatial
relationships. The cube test consists of twenty-seven small cubes obtained
from a large cube of wood painted black and then cut twice in each
direction. The problem is to put the small cubes together as quickly
as possible, with no painted sides inside and no unpainted sides
showing outside. This type of structural visualization is associated with
such occupations as engineering, surgery, machinist trades, and others
involving work with shapes and solids.
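The arithmetic behind the cube test can be worked out directly. The sketch below (the function name is hypothetical) counts how many of the small cubes carry three, two, one, or no painted faces when a painted cube is cut into three slices in each direction. It shows why the reassembly has only one kind of solution: corner, edge, face, and center positions each demand a cube with a particular number of painted sides.

```python
from collections import Counter
from itertools import product


def painted_face_counts(n=3):
    """Tally small cubes by number of painted faces for an n x n x n painted cube."""
    counts = Counter()
    for position in product(range(n), repeat=3):
        # A face is painted when the small cube lies on an outer layer
        # along that axis (first layer or last layer).
        painted = sum(1 for c in position if c == 0)
        painted += sum(1 for c in position if c == n - 1)
        counts[painted] += 1
    return dict(counts)


# Two cuts in each direction give three slices per direction: 27 small cubes
print(painted_face_counts(3))
```

For the twenty-seven-cube test this gives 8 corner cubes with three painted faces, 12 edge cubes with two, 6 face cubes with one, and a single unpainted center cube.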

Pencil-and-paper tests have been devised to measure spatial relations
by means of drawings of disarranged parts of a figure. The task
is to select which one of five drawings indicates how the parts fit
together. Other tests include sections on tool recognition, application
of tools to mechanical processes, arithmetic fundamentals, motor speed,
and mechanical information.

While both performance and pencil-and-paper tests aid in predicting
success in mechanical occupations, they measure so few of the
abilities required in complex mechanical occupations that additional
information is required. Such information must be obtained by
systematic observation, record of school shop work, and interview.

With few exceptions, the tests available at the present time for
measuring mechanical aptitude lack sufficient proof of their validity.
For this reason, the cruder the tests, the more cautious and ingenious
must the counselor or teacher be in advising students on the basis of
these test results.


The more prominent tests in this area may be grouped into the 
pencil-and-paper and performance types. Of the various pencil-and- 
paper mechanical aptitude tests, illustrations are presented below of 
exercises from three of the more widely used tests. 

The MacQuarrie Test for Mechanical Ability, published by the 
California Test Bureau, is used for junior- and senior-high-school 
students to measure such things as tracing, tapping, dotting, copying, 
etc. In the tracing test below the subject is asked to start at the 
little triangle and draw a curved line through the small openings in 
the vertical lines without touching them. 


[Figure: sample item from the MacQuarrie tracing sub-test]


In the Revised Minnesota Paper Form-Board Test the subject is 
asked to respond to the following type of question: First look at the 
two parts in the upper left-hand corner. You are to decide which 
figure shows how these parts can fit together. 


Copyright by Rensis Likert and William H. 
Quasha, published by Psychological Corp. 




The Tests of Mechanical Comprehension by Bennett and Fry ⁸ are
designed to measure the capacity of the subject to understand various
types of physical relationships. One sample type of item included is
as follows:

[Figure: shears to be compared]
Which would be the better shears for cutting metal?

Copyright by Psychological Corp., used by permission.

Performance tests are also available. All such tests involve some kind
of task such as handling tools, inserting oddly cut blocks into form
boards, putting a nut on a bolt, or inserting pegs into a board. Scores
are in terms of time it takes to perform the task. (Typical of these
tests are the pictures showing the Hand-Tool Dexterity Test equipment
and the Purdue Pegboard.)


The Bennett Hand-Tool Dexterity Test measures proficiency in the use
of wrenches and screwdrivers. The examinee takes apart the twelve
fastenings in a prescribed sequence and reassembles the nuts, washers,
and bolts in the right-hand upright. The time required is the score.

⁸ Tests of Mechanical Comprehension, Form AA, Bennett, G. K. Psychological
Corporation, 1940.

The Purdue Pegboard is an apparatus test measuring manipulative
dexterity in (a) gross movement of hands, fingers, and arms, and
(b) tip-of-finger small assembly work. Science Research Associates.


Table 21 presents general information about each of the more gen- 
erally known tests of mechanical aptitude. 


TABLE 21
Illustrative Mechanical Aptitude Tests

TEST TITLE                  PUBLISHER                    DATE         LEVEL
Bennett Hand-Tool           Psychological Corp.          1946         Adults
  Dexterity Test
MacQuarrie Test for         California Test Bureau       1925 (1943   Ages 16 and Over
  Mechanical Ability                                     Manual)
Minnesota Spatial           Educational Test Bureau      1930         Ages 11 and Over
  Relations Test
O'Connor Finger-            C. H. Stoelting & Co.        1926         Age 15-Adults
  Dexterity Test
O'Connor Tweezer-           C. H. Stoelting & Co.        1926         Age 15-Adults
  Dexterity Test
Pennsylvania Bi-Manual      Educational Test Bureau      1945         Ages 17 and Over
  Worksample
Prognostic Test of          California Test Bureau       1946         Grades 7-12, Adults
  Mechanical Abilities
Purdue Pegboard             Science Research Associates  1941         Grades 9-12, Adults
Revised Minnesota Paper     Psychological Corp.          1941         Ages 9 and Over
  Form-Board Test
SRA Mechanical Aptitudes    Science Research Associates  1947         [...]-12, [...]
Small Parts Dexterity       Psychological Corp.          1946         H.S. and Adults
  Test
Tests of Mechanical         Psychological Corp.          1947         Grades 9 and Over
  Comprehension

MUSICAL APTITUDE TESTS 


Most tests of musical aptitude are based on the psychological analy- 
sis of musical talent by Seashore, who constructed the first measures 
of musical talent. Currently, there are two types of presentation of the 
test exercises to the child or adult: (a) phonograph recordings and 
(b) playing of the standard exercises on a piano by an examiner. 

Regardless of the mechanics of presentation, a series of test exercises 
may measure such aspects of musicianship as the following: (a) sense 
of pitch, (b) sense of intensity, (c) sense of time, (d) tonal memory, 
(e) sense of rhythm, and (f) sense of timbre. As the test exercises are 
played, the child or adult records his answer on a special answer sheet. 

The test on sense of pitch, for example, presents a number of paired 
sounds and requires the individual to indicate whether the second 
sound is higher or lower in pitch than the first. Similar methods of 
presentation and responding are used for other parts of the test. 


ART APTITUDE TESTS 


Attempts to measure art aptitude vary from art judgment to drawing
an original design. The Meier Art Judgment Test presents pairs
of pictures. One of each pair is the reproduction of an artistic work
of recognized worth; the other of each pair has been altered in some
way to lower its artistic qualities. The individual is required to select
the more artistic picture in each pair.

The McAdory Art Test presents four variations of a single theme:
pictures of furniture and utensils, texture and clothing, architecture,
shape and line arrangement, dark and light masses, and color. The
individual ranks each of the four variations in order of merit.

Other tests require such exercises as drawing a design from memory,
creating and completing designs from supplied elements, recognition
of proportions, originality of line drawing, analysis of problems in
perspective, and recognition of color.


The child who achieves a high percentile score on an art aptitude 
test has potential art ability. Additional evidence from observation of 
performance in art situations is essential to arrive at reasonable judg- 
ments of the capacity of the individual. 

Table 22 lists selected art and music tests, their publishers, date of 
publication, and age or grade level. 


TABLE 22
Illustrative Art and Music Aptitude Tests

TEST TITLE                  PUBLISHER                    DATE   LEVEL
Drake Musical               Public School Pub. Co.       1934   Grade 4 and Over
  Memory Test
Graves Design               Psychological Corp.          1948   Grades 7-16, Adults
  Judgment Test
Horn Art Aptitude           C. H. Stoelting Co.          1951   Grades 12-16, Adults
  Inventory
Knauber Art Ability         Author (Alma J. Knauber)     1935   Grades 7-16
  Test
Lewerenz Test in            California Test Bureau       1927   Grades 8-12
  Fundamental Abilities
  of Visual Arts
McAdory Art Test            Bureau of Publications,      1929   Grades 1-16, Adults
                              Teachers College,
                              Columbia Univ.
Meier Art Tests: I. Art     Bureau of Educational        1940   Grades 7-12
  Judgment                    Research and Service
Musical Aptitude Test       California Test Bureau       1950   Grades 4-10
Seashore Measures of        Psychological Corp.          1939   Grades 5-8, Adults
  Musical Talents,
  Rev. Ed.


APTITUDE TESTS FOR SPECIFIC OCCUPATIONS


In recent years selected colleges have employed admissions, or
aptitude, tests to choose prospective students for such subjects as
medicine, law, engineering, and nursing. Each year a new form of the
Medical Aptitude Test is constructed for the Association of American
Medical Colleges. Among the abilities measured are: comprehension
and retention, visual memory, memory for content, logical reasoning,
scientific vocabulary, and understanding of printed material.

In the past, law aptitude tests have included sections on: capacity
for accurate recall, comprehension and reasoning by analysis, symbolic
logic, and comprehension of difficult reading. Engineering aptitude
tests stress mathematical ability, spatial perception, and general
scholastic aptitude. Nursing aptitude tests usually include subtests on
scientific vocabulary, general information, visual memory, memory for
content, understanding of printed materials, and ability to understand
and follow directions.


MULTIPLE APTITUDE TEST BATTERIES 


The factor analysis studies of many tests have resulted in the identi- 
fication of some relatively independent abilities and skills that are 
assumed to be valuable in predicting success in various tasks or occu- 
pations. The need for additional research in establishing the predictive 
validity of the differential aptitude tests is apparent. The magnitude 
and the complexity of the job of validation are so great that psycholo- 
gists have made but modest advances in this task. 

Test batteries for multiple, or differential, aptitudes include the
Primary Mental Abilities Tests, Differential Aptitude Tests, and
Flanagan Aptitude Classification Tests. The General Aptitude Test Battery
of the United States Employment Service is of the same type, but it is
not available for use outside the USES offices. A brief description of
each of the batteries is provided below.


Primary Mental Abilities Tests were constructed by Thurstone as a
[...]). These tests are published by Science Research Associates.

The PMA for ages 7-11 provides measures [...] of intelligence as
follows: V [...]

The PMA for ages 11-17 differs from the ages 7-11 battery in that
Perception is omitted and Word Fluency is added for the older age
group.

Differential Aptitude Tests were constructed by Bennett, Seashore, 
and Wesman. They are published by the Psychological Corporation. 
The authors continue to conduct studies of the predictive validity of 
the tests for use in educational and vocational guidance at the junior- 
and senior-high-school levels. The Differential Aptitude Tests provide 
eight scores: Verbal Reasoning, Numerical Ability, Abstract Reasoning,
Space Relations, Mechanical Reasoning, Clerical Speed and Accuracy,
Spelling, and Sentences (Grammatical Usage).

The 1952 edition of the Manual for the Differential Aptitude Tests 
includes not only instructions for administering, scoring, and inter- 
preting the test results, but also a compact summary of numerous 
validation researches completed since the publication of the battery 
in 1947. Correlative publications, such as Counseling from Profiles: A
Casebook for the D.A.T., are helpful in the interpretation of test
results.

Flanagan Aptitude Classification Tests were constructed by Flanagan
in an effort to establish a standard classification system for
describing those aptitudes that are important for successful performance
of particular occupational tasks. The battery is published by Science 
Research Associates and consists of the following tests: 


1. Inspection: ability to spot quickly flaws or imperfections in a
series of articles
2. Coding: speed and accuracy of coding typical office information
3. Memory: ability to remember the codes learned in Test 2
4. Precision: speed and accuracy in circular finger and hand
movements
5. Assembly: ability to visualize the appearance of an object
from separate parts
6. Scales: speed and accuracy in reading scales, graphs, and
charts
7. Coordination: ability to coordinate hand and arm movements
8. Judgment and Comprehension: ability to read with understanding,
to reason logically, and to use judgment
9. Arithmetic: skill in working with numbers: adding, subtracting,
multiplying, and dividing
10. Patterns: ability to reproduce simple pattern outlines accurately
11. Components: ability to identify important component parts in
line drawings and blueprint sketches
12. Tables: ability to read tables of numbers and tables of words
and letters
13. Mechanics: ability to understand mechanical principles and
to analyze mechanical movements
14. Expression: knowledge of correct English


In the Counselor’s Booklet for the Flanagan Aptitude Classification
Tests (FACT), job descriptions for thirty occupations are presented
and specific tests are recommended for each occupation. Thus, for the
job classification of Accountant, the following Critical FACT Elements
are recommended: Test 2 (Coding), Test 3 (Memory), Test 8
(Judgment and Comprehension), Test 9 (Arithmetic), and Test 12
(Tables).

Some follow-up studies have been made of graduates of Pittsburgh 
high schools to whom some of the tests were administered. More care- 
ful study and research are required to establish the predictive validity 
of the FACT for various occupational classifications. This series of 
tests was published in January, 1954. Investigators will gradually add 
to the fund of information about the value of these tests. 

General Aptitude Test Battery of the United States Employment
Service is not generally available for use, but the fifteen tests which
comprise the GATB measure the following aptitudes which contribute
to occupational success: G (intelligence), V (verbal ability),
N (numerical ability), S (spatial ability), P (form perception),
Q (clerical perception), A (aiming), T (motor speed), F (finger
dexterity), and M (manual dexterity). There are eleven paper-and-pencil
and four apparatus tests. Occupational Aptitude Patterns have been
established for twenty fields of work, representing two thousand
occupations, setting critical scores which will eliminate the lowest
third of the standard scores of the employed sample. An Individual
Aptitude Profile is prepared from [...] compared with the twenty
Occupational [...] the fields of work that are most suitable for the
person's abilities.
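The idea of a critical score that eliminates the lowest third of an employed sample, and of matching an individual profile against several occupational patterns, can be sketched as follows. Everything in the sketch is an assumption made for illustration: the field names, the aptitudes attached to each field, the standard scores, and the simple rule of taking the score just above the lowest third of the sorted sample. It is not the procedure of the United States Employment Service, whose materials are not reproduced in the text.

```python
def critical_score(employed_scores):
    """Score that the lowest third of the employed sample falls below (ties ignored)."""
    if not employed_scores:
        raise ValueError("employed sample is empty")
    ordered = sorted(employed_scores)
    lowest_third = len(ordered) // 3  # number of workers to be eliminated
    return ordered[lowest_third]


def build_pattern(employed_sample):
    """Map each aptitude letter to its critical score for one field of work."""
    return {apt: critical_score(scores) for apt, scores in employed_sample.items()}


def suitable_fields(profile, patterns):
    """Fields of work whose critical scores are all met by the individual profile."""
    return [
        field
        for field, cuts in patterns.items()
        if all(profile.get(apt, 0) >= cut for apt, cut in cuts.items())
    ]


# Hypothetical standard scores of employed workers in two invented fields
EMPLOYED = {
    "office work": {
        "G": [80, 85, 90, 95, 100, 105, 110, 115, 120],
        "Q": [70, 80, 90, 100, 110, 120],
    },
    "bench assembly": {
        "F": [60, 75, 90, 105, 120, 135],
        "M": [85, 90, 95, 100, 105, 110],
    },
}
PATTERNS = {field: build_pattern(sample) for field, sample in EMPLOYED.items()}

# Hypothetical individual aptitude profile
profile = {"G": 102, "Q": 95, "F": 88, "M": 110}
print(PATTERNS)
print(suitable_fields(profile, PATTERNS))
```

In this invented case the profile meets both critical scores for the first field but falls short on finger dexterity for the second, so only the first field is returned.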


Summary 


Aptitudes may be defined as capacities and abilities for a given line
of endeavor, such as a particular art, school subject, or vocation.
Evidence from current research studies supports the theory that
aptitudes are both innate and acquired; are pluralistic rather than
unitary in character; are more constant than variable for an individual;
tend to fit the normal frequency distribution curve when an unselected
population is measured on specific aptitudes and [...].
It is the function of the school to identify aptitudes, [...]


the development of talents and abilities, and to provide educational 
and vocational guidance so that educational opportunities may be 
adapted to individual abilities and needs. In the evaluation of apti- 
tudes, both specialists and classroom teachers should be involved in 
the appraisal made. The time at which aptitudes should be evaluated 
is determined by when such information will be most useful in the 
guidance process. Aptitudes should be evaluated by means of tests, 
supplemented by such other methods as interview, questionnaire, ob- 
servation and anecdotal record. 

The types of aptitude tests discussed in this chapter include intelli- 
gence, prognosis for ability in school subjects, clerical aptitude, man- 
ual dexterity, mechanical aptitude, musical aptitude, art aptitude, apti- 
tude for specific vocations, and multiple aptitude test batteries. In- 
telligence tests, both individual and group, are used to measure gen- 
eral intelligence or academic ability for success in the usual school 
subjects. Prognostic tests usually comprise introductory skills related 
to such school subjects as algebra, chemistry, physics, Latin, and 
French. 

Aptitude tests are used with increasing frequency as one among 
several types of evidence that aid in the selection or guidance of indi- 
viduals for special occupations. So complex are the mental, physical, 
and emotional characteristics which contribute to success in an educa- 
tional career or in an occupation that only selected characteristics can 
be measured by aptitude tests. Other types of evidence may be sup- 
plemented by aptitude tests in reaching decisions. For example, in- 
dividuals achieving high scores on scholastic aptitude tests are good 
risks for academic success in high school and college. Likewise, indi- 
viduals achieving high scores on music and art aptitude tests are good 
risks for success in their respective fields. Clerical aptitude tests have 
value in predicting success in clerical work, typing, and stenography. 

Manual dexterity tests provide important information about the 
ability of an individual to work rapidly and skillfully with the fingers, 
hands, and arms, as well as eye-hand coordination, for aspects of an 
occupation requiring these abilities. In a like manner, mechanical apti- 
tude tests provide valuable supplementary data on those aspects of 
an occupation requiring ability to visualize space relations and ability 
to recognize and utilize tools and mechanical processes. 

Professional schools for medicine, law, engineering, and nursing are
increasingly using aptitude test batteries to select applicants who are
most likely to succeed in completing the courses of study. However,
additional evidence based on the previous school record of the
individual is employed to assess scholastic success, attitudes, and
personality.


Problems for Class Discussion 


1. For a class, compare the intelligence (academic aptitude) test scores
with pupil achievement in academic subjects. How well does the
academic aptitude test predict achievement?

2. Indicate how aptitude tests might be used in the educational and
vocational guidance of several teen-age youths whom you know.

3. To an increasing extent, tests are being used to select personnel in large
industrial or business firms. Choose three different positions which might
be found in any large firm and draw up a testing program for selecting
the best applicant for each position.


References Cited in This Chapter 


1. Buros, O. K., editor, The Nineteen Forty Mental Measurements
Yearbook. Highland Park, N. J.: The Mental Measurements Yearbook, 1941.

2. Buros, O. K., editor, The Third Mental Measurements Yearbook. New
Brunswick, N. J.: Rutgers University Press, 1949.

3. Buros, O. K., editor, The Fourth Mental Measurements Yearbook.
Highland Park, N. J.: The Gryphon Press, 1953.

4. Good, C. V., editor, Dictionary of Education. New York: McGraw-Hill
Book Co., 1945.

5. Guilford, J. P., editor, “Printed Classification Tests,” AAF Aviation
Psychology Report No. 5, Washington, D. C.: Government Printing
Office, 1947.

6. Hull, C. L., Aptitude Testing. Yonkers, N. Y.: World Book Co., 1928.

7. Kelley, T. L., Crossroads in the Mind of Man. Palo Alto: Stanford
University Press, 1928.

8. Shartle, C. L., et al., “Ten Years of Occupational Research, 1934-1944,”
Occupations, 22:431-441, 1944.

9. Super, D. E., Appraising Vocational Fitness. New York: Harper &
Brothers, 1949.

10. Thurstone, L. L., The Vectors of Mind. Chicago: University of Chicago
Press, 1935.

11. Warren, H. C., Dictionary of Psychology. Boston: Houghton Mifflin,
1934.


References for Further Reading 


Super, D. E., Appraising Vocational Fitness. New York: Harper & Brothers,
1949.

The most complete, detailed, and [...] aptitude tests, their uses,
interpretation [...] research findings. This book should be consulted
by persons who wish
detailed and authoritative information about specific aptitude tests. 


Thorndike, R. L., Personnel Selection: Tests and Measurement Techniques. 
New York: John Wiley and Sons, 1949. 

Although the purpose of this book is broader in scope than the pur- 
pose of the Super book, it contains comprehensive and authoritative 
information about aptitude tests and related procedures in personnel 
selection. 


Evaluating 
Personal-Social Adjustment 


CHAPTER EIGHTEEN 


In recent years education and related fields have emphasized the
need for concern about the development of well-adjusted individuals.
Studies have shown that personal and social adequacy is just as
important for successful adaptation to life and work 
as academic and vocational skills. In fact, more job failures are caused 
by maladjustment than by lack of skill. In schools, teachers have rec- 
ognized that failure to learn has often been due to emotional and so- 
cial difficulties. Psychiatric and psychological studies show that many 
maladjustments have their roots in early childhood. One cannot but 
be impressed with the importance of the educational goal of whole- 
some adjustment, both as a direct help to children in school and as a 
major preventive force against later mental breakdown and malad- 
justment. Certainly schools cannot do the whole job, but they can 
help considerably. 

When academic achievement is the primary educational aim, re- 
ports to parents and others consist mainly of measures of academic 
achievement—either standardized test results or teacher ratings. In a 
program which stresses emotional and social development as well, 
additional records and methods of appraisal are introduced. These rec- 
ords are concerned with the children's thinking and feeling, interests, 
and personal and social adjustment. Because these aspects of child 
growth cannot be measured as readily by objective means as can 
physical and intellectual aspects, the records are more subjective. 
However, whatever objectivity is lost is compensated for by the living
picture of growth which such records afford. 


CONCEPTIONS AND DEFINITIONS OF 
PERSONALITY AND ADJUSTMENT 


One of the major problems confronting workers in this area lies in 
the variety of conceptions and definitions of personality and adjustment.
There is at present disagreement, as Olson (6) points out, about 
the nature of personality and adjustment. This failure to arrive at a 
common definition of personality and personal-social adjustment de- 
rives from two conflicting schools of thought. One conceives of per- 
sonality as composed of a great number of discrete elements called 
attributes or traits. The other views personality as a totality or “Ge- 
stalt,” holding that the individual reacts in terms of one integrated 
pattern which is reflected in every response. Growing out of these con- 
ceptions of personality are two views of personal-social adjustment. It 
is clear that the way one conceives of personality will have a bearing 
on one's conception of the relation of a particular personality to other 
personalities or to social reality. The nature of this relationship is 
what is commonly understood as adjustment. 

The advocates of the trait version of personality are apt to speak of 
adjustment in many ways. They may emphasize adjustment to the 
home, to the school, or to the community. The workers influenced by 
the "Gestalt" view of personality focus on the total personal-social ad- 
justment of the individual. They believe that the over-all response of 
an individual to situations or people will be similar in quality or 
deeply interrelated in varying situations. Simply, they say that there 
may be a tendency toward adequate or inadequate personal-social 
adjustment which will always be evident. 


DIFFICULTIES IN EVALUATING ADJUSTMENT 


The growing emphasis in the schools on personal-social adjustment, 
as well as on personal and social growth, has led to a deepening con- 
cern with measuring and evaluating processes of adjustment in these 
areas. It is clear that when a school includes the development of bet- 
ter personal and social relationships as one of its functions, it must 
learn whether it is fulfilling its objectives in this area. The evaluation 
of this growth, however, poses many problems of which the teacher 
needs to be aware. This awareness of the difficulties should not deter 
anyone from this important undertaking, but simply serve as a warn- 
ing against drawing easy and quick conclusions. 

Many techniques which will be discussed yield data regarding personal
and social relationships. The interpretation of that data, however,
is a task which requires varying amounts of specialized knowledge.
To determine the meaning of a particular behavior requires
some understanding of the dynamics of behavior in general. Sometimes,
a background in the general principles of psychology and mental
hygiene may help the teacher interpret the acts of a specific child.
Rich experience with children may aid in clarifying meanings for the
[...] yielded by projective techniques are an illustration of this 
point. Though projective techniques vary in the complexity of admin- 
istration, nearly all of them require considerable training and special- 
ized skill for interpretation. Other techniques depend for their efficacy 
on the interpersonal relationship established between teacher and 
pupil, or between the pupil tested and the individual administering 
the test. The nature of the rapport which is established determines 
the frankness and sincerity with which the pupil will answer certain 
personal and intimate questions. Without reliance on the veracity of 
the statements made by the subject, many methods of evaluating per- 
sonal-social adjustment lose in validity. 

In the evaluation of personal-social adjustment much more time is 
consumed than in the evaluation of achievement or intelligence. Many 
methods consist in long-range observations or in ongoing descriptions 
of behavior. Others require repeated testing or administering of in- 
ventories. Processes of adjustment are processes in time. There is no 
short cut to their evaluation. 

The material may be interpreted as it accumulates, or interpreta- 
tion may be postponed until after many forms of evidence are in and 
the picture of the whole individual is beginning to emerge. Despite 
the formulation of general interpretative principles it is important to 
recognize that to a certain extent every individual is unique. General 
principles have their limits in application to the individual. On the 
other hand, interpretation may be too subjective. The teacher may 
come to conclusions about behavior which have little reference to the 
behavior and relations of children in general. In the present state of 
evaluation in this area, either pitfall is sometimes unavoidable. It is 
desirable, however, to strike a balance between the two. 


SURVEY OF TECHNIQUES FOR ASSESSING 
PERSONAL-SOCIAL ADJUSTMENT 


A variety of techniques are at present employed to appraise per- 
sonal-social adjustment. This chapter briefly discusses some of the 
most widely used and most promising of these techniques. A more 
comprehensive statement describing, illustrating, and discussing each 
of them is contained in another section of this volume. In this chapter 
an overview of techniques is discussed under the following headings: 
(a) self-descriptive inventories or personal reports, (b) rating scales 
of personal and social conduct, (c) observational and anecdotal records,
(d) free association and projective methods, (e) autobiographies,
(f) interviews, (g) sociometric techniques, and (h) situational 
tests. Some of the categories of appraisal methods comprise a variety 
of techniques and variations of the same technique. 


(a) Self-descriptive Inventories or Personal Reports 


In the personal report type of personality test the subject evaluates 
himself. The tests or inventories are standardized questionnaires in 
which the individual is asked questions about how he reacts or how 
he feels in various situations. The individual rates himself in a variety 
of areas and describes his likes, dislikes, fears, and problems. 

Self-descriptive inventories measure certain aspects of personal- 
social adjustment such as neurotic tendency, self-sufficiency, and intro- 
version-extroversion. A numerical score is obtained for an individual's 
self-rating in each of these areas. Certain inventories evaluate adjust- 
ment in specific life situations by asking questions pertinent to them. 
The Bell Adjustment Inventory, for instance, provides scores for home 
adjustment, health adjustment, social adjustment, and emotional ad- 
justment. 

The advantages of these tests are many. They are easily adminis- 
tered and scores are obtained by following simple instructions. For 
each age group, norms have been established and the teacher may 
evaluate the scores of a particular child with reference to these norms. 
Problem areas may be revealed, though remedial action can be un- 
dertaken wisely only when the causes and dynamics of behavior are 
known. The tests may be repeated to gauge the extent of progress of 
adjustment for an individual child or for a group of children. The ad- 
ministration of the inventory may further increase self-awareness and 
self-criticism on the part of the pupils who are describing their 
problems. 
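
The comparison of one pupil's score with age-group norms, and the use of a retest to gauge progress, can be illustrated with a minimal sketch. The norm scores below are invented and do not come from any published inventory, `percentile_rank` is a hypothetical helper, and the definition used (percent of the norm group below the score, counting ties as half) is only one common convention. Whether a higher score indicates better or poorer adjustment depends on the particular inventory.

```python
from bisect import bisect_left, bisect_right

def percentile_rank(score, norm_scores):
    """Percent of the norm group scoring below `score`, counting ties as half."""
    ordered = sorted(norm_scores)
    below = bisect_left(ordered, score)            # norm scores strictly lower
    equal = bisect_right(ordered, score) - below   # norm scores exactly equal
    return 100.0 * (below + 0.5 * equal) / len(ordered)

# Hypothetical norm sample for one age group (raw inventory scores).
norms_age_12 = [8, 10, 11, 12, 12, 13, 14, 15, 17, 20]

# The same pupil tested twice, some months apart (invented raw scores).
first_test = percentile_rank(15, norms_age_12)
retest = percentile_rank(12, norms_age_12)
print(first_test, retest)
```

A change in percentile rank between testings is only a starting point for inquiry; as the text notes, remedial action requires knowledge of the causes and dynamics of the behavior.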

Despite their wide use, tests of adjustment taking the personal re- 
port form have grave limitations. The individual reports his problems 
and his behavior in verbal terms only. The investigator is forced to 
depend upon the frankness and the sincerity of the subject. Fre- 
quently, the individual tested will not answer truthfully for a variety 
of reasons. He may want to conceal certain areas of his personality, 
or he may wish to please the investigator by answering in a particu- 
lar way. In addition, many individuals have areas of conflict of which 
they are not aware, and these may not be revealed in the inventory. 
Personality tests utilizing questionnaires are further open to the
criticism that they are based on a theory of personality which is as yet 
unsubstantiated. These tests generally measure personality traits, or 
aspects of personality, and scores are reported for these traits or as- 
pects. But this trait theory of personality and adjustment is vehe- 
mently opposed by workers in the areas of Gestalt psychology and 
psychoanalysis. 

These limitations in no way abrogate the usefulness of these tests 
as diagnostic and exploratory tools in the emerging area of the evalua- 
tion of personal-social adjustment. For a detailed description of some 
of the inventories used, a discussion of the aspects of personality
assessed by them, and the types of tests applicable to various age 
groups, see Chapter Ten on Personal Reports and Projective Techniques.


(b) Rating Scales of Personal and Social Conduct 


Rating scales are used by teachers, supervisors, and parents to eval- 
uate the personal and social conduct of pupils. These devices assess 
the adjustment of an individual child in terms of specified aspects of 
behavior. Children may be rated on cheating, lying, defiance to disci- 
pline, mental alertness, cooperativeness, and many other characteris- 
tics which reveal themselves in the behavior of the child. Rating scales 
are useful only if the rater is acquainted with the individual to be 
rated. The more opportunities the rater has for observing the individual,
the higher the validity of the rating will be. Since subjective 
elements necessarily enter into the ratings, the reliability and validity
of rating scales is increased if the ratings of several judges are combined.

There are three types of rating scales which are commonly used: 
(a) descriptive rating scales, in which the rater chooses one of several 
descriptive phrases regarding a behavior form or trait, (b) numerical 
rating scales, in which numbers are assigned to every trait rated, and 
(c) graphic rating scales, which consist of a line continuum under
which descriptive phrases are printed horizontally at various points.
An example of one type of graphic rating scale is given below:

[Graphic rating scale: “Is this pupil dependable?” A horizontal line
marked at five points, with a descriptive phrase printed beneath each
point. The first two phrases are “Dependable in all situations” and
“Usually dependable”; the remaining three are not legible in the source.]




Table 23 lists some of the rating scales in use today for the ap- 
praisal of personal and social conduct, and indicates the grade for 
which they are intended and the aspects of behavior they purport to 
measure. 


TABLE 23
Illustrative Rating Scales of Behavior and Personality

TEST AND PUBLISHER             ASPECTS OF BEHAVIOR MEASURED       LEVEL

Haggerty-Olson-Wickman         Cheating, lying, temper            Elementary
Behavior Rating Schedule       outbursts, speech difficulties,    grades
(World Book Co.)               imaginative lying, etc.

Winnetka Scale for Rating      Cooperation, social                Nursery school
School Behavior and            consciousness, emotional           through
Attitudes (Winnetka            security, leadership, etc.         grade 6
Educational Press)

BEC Personality Rating         Mental alertness, initiative,      Grades 7
Schedule                       dependability, cooperativeness,    and above
(Harvard Univ. Press)          judgment, etc.

Vineland Social                Varying degrees of social          Age 1 to
Maturity Scale                 maturity                           25 years
(Educational Test Bureau)


Rating scales are useful instruments for the evaluation of adjust- 
ment if the teacher understands the nature of the device and the er- 
rors possible in its application. One of the dangers inherent in rating, 
aside from prejudice and lack of acquaintance with the person or 
persons rated, is the “halo effect.” If the teacher rates a given pupil 
on all the attributes listed on the scale at one time, the rating given 
for one trait may influence the rating made on the next. There may 
be a tendency to rate one pupil “high” all the way through and an- 
other “low.” A detailed discussion of rating methods, their variety, 
advantages, and disadvantages is contained in Chapter Nine. 
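
The practice of combining the ratings of several judges, mentioned earlier in this section, can be sketched as a simple averaging step. The judges, pupils, traits, and 1-to-5 numerical scale below are all invented for illustration, and `pooled_ratings` is a hypothetical helper rather than part of any published rating schedule.

```python
from statistics import mean

# Hypothetical ratings on a 1-5 numerical scale: ratings[judge][pupil][trait].
ratings = {
    "teacher_a": {
        "pupil_1": {"dependability": 4, "cooperativeness": 5},
        "pupil_2": {"dependability": 2, "cooperativeness": 3},
    },
    "teacher_b": {
        "pupil_1": {"dependability": 5, "cooperativeness": 4},
        "pupil_2": {"dependability": 3, "cooperativeness": 3},
    },
    "supervisor": {
        "pupil_1": {"dependability": 3, "cooperativeness": 3},
        "pupil_2": {"dependability": 1},  # this judge did not rate cooperativeness
    },
}

def pooled_ratings(ratings):
    """Average each pupil's rating on each trait over the judges who rated it."""
    collected = {}
    for judge_ratings in ratings.values():
        for pupil, traits in judge_ratings.items():
            for trait, value in traits.items():
                collected.setdefault(pupil, {}).setdefault(trait, []).append(value)
    return {
        pupil: {trait: mean(values) for trait, values in traits.items()}
        for pupil, traits in collected.items()
    }

pooled = pooled_ratings(ratings)
print(pooled)
```

Averaging reduces the weight of any one rater's prejudice, but it does not remove a halo effect that all raters share.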


(c) Observational and Anecdotal Records 


Techniques of observation and anecdotal recording have come to 
play an increasingly important role in the field of evaluation. There is 
no substitute for observation of the actual behavior of individuals over 
a period of time and the recording of significant actions and interactions.
There are many forms of controlled observation which have 
demonstrated their usefulness. The most rigidly controlled of these 
consists in observing a particular activity or a particular child at regu- 
lar intervals for a short or longer period of time. Observational data 
may be obtained in two ways. One procedure consists of having pre- 
defined behavior categories which have been developed through ex- 
ploratory work. Every time a given behavior occurs which is described 
in one of the categories, the appropriate code is entered. The other 
procedure of observation consists of a running account of everything 
which seems significant to the observer at the time of observation. 
This is more flexible and permits a greater variety of observation. It, 
however, falls short of the criteria of scientific adequacy, since the ob- 
server's role looms large in this method. Direct observation has yielded 
promising results when used as a research tool. 
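
The first procedure described above, in which a code is entered each time a predefined behavior occurs during a regular observation interval, lends itself to a simple tally. The category codes, their descriptions, and the observation record in this sketch are invented for illustration; an actual study would take its categories from exploratory work.

```python
from collections import Counter

# Hypothetical behavior categories developed in advance, keyed by code.
CATEGORIES = {
    "C": "cooperates with another child",
    "W": "withdraws from the group",
    "A": "aggressive act toward another child",
}

# One entry per observation interval for one child (invented record).
# "-" means none of the predefined behaviors occurred in that interval.
observations = ["C", "C", "-", "W", "C", "A", "-", "W", "C", "C"]

def tally(observations, categories):
    """Return, for each code, its count and its share of all intervals observed."""
    counts = Counter(code for code in observations if code in categories)
    total = len(observations)
    return {code: (counts[code], counts[code] / total) for code in categories}

summary = tally(observations, CATEGORIES)
for code, (count, share) in summary.items():
    print(f"{CATEGORIES[code]}: {count} intervals ({share:.0%})")
```

A running account, the second procedure, cannot be tallied this way until the observer's notes have themselves been coded.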

Teachers have used observation in the form of anecdotal records. 
The teacher records incidents in the life of the pupil which seem sig- 
nificant as a revelation of his emotional and social adjustment. The 
formulation of the criteria of significance may be left up to the teacher 
or grow out of staff conferences. Anecdotes should be objective de- 
scriptions of an event or a situation which highlights the adjustment 
problem of the child. They need to be written without interpretation, 
if possible. If these records are kept over periods of time by several 
teachers for one child, a picture of the personal-social adjustment of the 
child will emerge. The summarization and interpretation of the records
can be made a joint enterprise in which several teachers, the [...]
direct observation [...]. Teachers often [...] describe
behavior without superimposing an interpretation upon the description.
Because of the pressure of other duties, [...] not be able to make
time for recording observations. Often, the records may be kept only
for those students who somehow are able to make themselves noticed by
the teacher. Frequently the shy, introverted child will escape notice,
though it is this child who may present a severe problem of social and
personal adjustment.

Chapter Seven on observational methods and anecdotal records surveys
the problems connected with the use of these procedures and gives
references to literature dealing with this important area of
evaluation.


(d) Free Association and Projective Methods 


Free association techniques and projective methods represent global 
methods of evaluation (2). They endeavor to appraise the total per- 
sonality configuration instead of isolated attributes or traits. The 
theory behind free association and projective techniques stresses the 
totality and integration of personality. It holds that in its response to 
environmental stimuli, the personality is apt to reveal its characteristic 
mode of reacting. To a certain extent, any response to a controlled 
stimulus fulfilling certain requirements will uncover the dominant 
personality constellation of the responding individual. The require- 
ments for the experimental stimulus are simply that it be relatively 
ambiguous or unstructured, that it have no definite meaning or structure.

In a free association test, as first formulated by the Swiss psychiatrist 
Jung and later adapted by the Americans, Kent and Rosanoff, the in- 
dividual is asked to respond to a series of stimulus words. He is given 
a stimulus word and asked to say the first word which comes into 
his mind after hearing the stimulus word. In analyzing the response 
patterns two elements are emphasized, the length of time it takes an 
individual to free-associate and the kinds of answers which he gives. 
The length of time for association indicates the presence or absence 
of a “block” to certain ideas. The types of responses given are scored 
by checking them against lists which represent the response patterns 
of specific groups such as neurotic and normal. Later variations of free 
association techniques include sentence completion (7) and picture 
association methods (8). 
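
The two elements of analysis named above, the time taken to respond and the kind of response given, can be sketched as follows. Everything specific here is an assumption made for illustration: the stimulus words, the lists of common responses, the reaction times, and the rule that a response slower than twice the subject's own median time suggests a possible “block.” None of it is taken from the Kent-Rosanoff lists or from any published scoring procedure.

```python
from statistics import median

def score_free_association(trials, common_responses, slow_factor=2.0):
    """Summarize one subject's free association record.

    trials: list of (stimulus_word, response_word, reaction_time_in_seconds).
    common_responses: dict mapping each stimulus word to a set of frequent responses.
    slow_factor: assumed cutoff; a reaction slower than slow_factor times the
                 subject's own median time is flagged as a possible "block".
    """
    typical = median(seconds for _, _, seconds in trials)
    possible_blocks = [word for word, _, seconds in trials
                       if seconds > slow_factor * typical]
    uncommon = [word for word, response, _ in trials
                if response not in common_responses.get(word, set())]
    return {"possible_blocks": possible_blocks, "uncommon_responses": uncommon}

# Invented record for one subject.
trials = [
    ("table", "chair", 1.2),
    ("dark", "light", 1.0),
    ("mother", "door", 4.5),
    ("sleep", "bed", 1.4),
    ("river", "water", 1.1),
]

# Invented lists of frequent responses for each stimulus word.
common_responses = {
    "table": {"chair", "eat"},
    "dark": {"light", "night"},
    "mother": {"father", "child"},
    "sleep": {"bed", "rest"},
    "river": {"water", "stream"},
}

summary = score_free_association(trials, common_responses)
print(summary)
```

A flagged word is at most a lead for further study; interpretation of such records requires the specialized training discussed elsewhere in this chapter.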

Most projective methods use more complex and less structured 
stimuli than the free association test. Stimuli used are chosen to permit 
the full projection, expression, and externalization of the inner struc- 
ture of personality. These methods do not create awareness on the 
part of the individual that he is revealing his needs, problems, interests,
and complexes, since he is not asked to give a personal report 
or describe himself. A great variety of techniques have recently been 
developed which make use of the diagnostic potentialities of projection
and expression. Some techniques have had a previous history and 
have now reappeared within this new framework. Scientific validation 
for most of them is proceeding slowly, and a well-founded appraisal 
of their value and limitations is not yet available. 

The most widely used of the projective methods is the Rorschach 
ink blot test (2), previously described in Chapter Ten. The stimulus 
for the individual is a series of ten ink blots. There are characteristic 
differences between individuals and groups of individuals in the 
manner in which they perceive the ink blots. The ink blot can be 
interpreted in many ways. Every interpretation imposes a structure 
upon the blot which reveals the dominant personality characteristics 
of the subject. In arriving at an interpretation of the shape represented 
by the blot, the subject is apt to use various aspects of the blot as 
determinants of perception. These determinants (form, color, shading, 
etc.) are the key to a scoring system which classifies the individual 
on the basis of the determinants used. 

The Thematic Apperception Test, also discussed in Chapter Ten, 
and other pictorial projective methods (2) use unstructured pictures 
representing human beings in some form of interaction as the stimu- 
lus material. The subject is instructed to tell a story about the people 
in the picture, and is told that his fantasy or creativity is being 
tested. The stories developed around the pictures are used to uncover 
the subject's needs, interests, and problems. Though the raw material 
for the fantasy derives from many sources, the personal and sub- 
jective component has purportedly been percolated out in systems of 
scoring and interpretation. 

The term “projective methods" has come to be applied to a variety 
of other techniques which are more expressive than projective. Among 
those are techniques utilizing drawing, painting, and handwriting. 

Drawing and painting techniques (1) are developed both around 
spontaneous art activity of children and around specific tasks. Differ- 
ences in the drawing of a man and a woman have been used for diag- 
nostic purposes. The conception of the drawing, the maturity of 
execution, the size, and many other factors have been systematically 
investigated to derive generalizations which are then applicable to 
individual children. Handwriting analysis has emphasized the motor 
as well as psychological component in handwriting. Particularly for 
small children, play techniques have proved interesting and useful 
[...] his conception of [...]. Systematic play techniques [...]
quantifiable data.


The main difficulties connected with the use of projective techniques 
lie in the training and skill needed for administration and interpretation
of some of them. Even the use of drawing and play techniques
may be limited for most teachers, since interpretation [...] is
often made in terms of dynamics of behavior with which they may be 
unfamiliar. Material yielded by these techniques can [...] be
used for guidance if analysis by competent specialists is available. 
Experimentation with some of the less complicated techniques may 
frequently yield valuable clues and point up specific points of concern. 

Current agreement seems to be that projective methods do not re- 
veal all facets of personality as some proponents of them have be- 
lieved. They do, however, reveal aspects of personality not accessible 
to other techniques of evaluation. The material which is provided by 
projective techniques may serve as a point of departure for further 
investigation. If possible, other techniques should be used as corrob- 
orative evidence for the insights gained. 


(e) Autobiographies 

The autobiography as a device of self-description has a long his- 
tory. Many insights about some of the literary and scientific geniuses 
of former times have been obtained from a careful perusal of their 
autobiographies. Compositions of the autobiographical type have been 
used as a method for revealing some of the salient problems of stu- 
dents to the teacher and counselor. Many teachers have at one time 
or another asked their students to write on the topic “My Happiest 
Moment” or “Things I Like to Do Best (or Least).” If the student has 
confidence in the teacher, has some ability to express his feelings and 
problems, and does not wish to hide certain facts out of fear of punish- 
ment, some of the material which is obtained has great value for 
proper guidance and adjustment (3). If the above factors are not 
present the usefulness of the autobiographic document is limited. The 
advantages and limitations of the autobiography are discussed more 
fully in Chapter Ten. 


(f) Interviews 


The interview is a method for obtaining data by a face to face 
conference in which the student tells the interviewer his story of the 
problem under discussion. The interviewer may employ controlled 
observation and rating techniques as a part of the interview. In the 
interview the evaluation is not standardized and the interviewer must 
make an interpretation from related or isolated answers. The interview 
is a flexible instrument, permitting free expression on the part of the 
subject as well as the pursuing of salient points by the interviewer. 
Aside from its use in the area of personality assessment, the interview 
has played an important role in the area of attitude research and 
public opinion polling. In the non-directive therapy school developed 
by Rogers the counseling interview has not only a diagnostic function 
but a therapeutic one as well. It is used to help an individual adjust 
himself to some particular problem or situation. 

The advantages of the interview are that (a) it often gives a more 
meaningful response than a written questionnaire because the exam- 
iner can follow up "leads" during an interview, (b) it permits the 
investigator not only to collect attitudes, likes, and dislikes but also
to find out the reasons for the responses, and (c) the interviewer 
deals with the whole personality of the respondent, not only the as- 
pects manifested by written communication. Disadvantages of the 
interview are that (a) the examiner, or interviewer, may give sugges- 
tions which will condition the reply of the person being interviewed, 
(b) an individual may be influenced not only by the questions but 
by the expression, the gestures, or the tone of the questioner, and (c) 
the interview is time-consuming and its results can rarely be
treated as anything more than informal and supplementary evidence. 

A good interview will be well planned and the interviewer will 
secure rapport with the individual being interviewed. Frequently this 
can be accomplished by an appeal to some interest of the person who 
is being interviewed. Attempts will be made to lessen tensions and 
the interviewer will keep the discussion to the main issue. The choice 
of questions will be such that they will influence to a minimum degree 
the responses which an individual makes. The records of such inter- 
views will be written, wherever possible, immediately following the 
interview so that there is an immediate record for future reference. 

An extensive discussion of the interview as an instrument for assessing
adjustment may be found in Chapter Eight. 


(g) Sociometric Techniques 


Sociometry is the name of an area of study dealing with the actual
psychological structures of human groups, communities, and society at
large. Sociometric techniques have recently been introduced into the
school following the pioneering work of Moreno and Jennings. Sociometry
postulates that groups have structures which consist of complex
interpersonal patterns. These relationships are accessible to study by
qualitative and quantitative procedures. The position of every child
within the structure and the nature of his relationships with other
children may be revealed by these techniques. Since adjustment is
largely a function or a corollary of these relationships, sociometric
techniques have come to play an important role in the evaluation of
adjustment. It is desirable, of course, that every relationship an
individual may have with another individual be considered concretely, a
fact which limits the scope of the investigation to a certain setting 
such as the school, a community, or a work group. 

Sociometric techniques endeavor to discover individuals in situa- 
tions where they spontaneously uncover their relationships. For in- 
stance, they require an individual to choose associates for any group 
of which he may be or may become a member. The techniques used 
are varied, but their common point of departure consists in asking 
individuals to choose or reject other persons on the basis of specific 
criteria or for specific purposes. The individual chooses others in 
order of preference, first, second, or third. Children in the classroom 
may be asked questions such as these: “With whom would you like 
to sit at the same table?” “Who is your best friend?” “With whom 
would you like to work on the same committee?” “With whom would 
you like to play?” 

The sociometric test may be followed by an interview in which the 
child is asked to state the reasons for choosing certain children and 
rejecting others. 

The results of the sociometric test may be charted in the form of 
a graph known as the sociogram, which represents the patterns of 
attractions and repulsions within the group studied. It makes visible 
and identifiable the relationship of every individual to every other 
individual in the group. This is done by representing each individual 
in the group by a symbol and then drawing lines indicating the relations,
positive, negative, or indifferent, of each individual to other individuals.
The lines may be of different colors to indicate attraction 
or repulsion, or arrows may be used for this purpose. 

Once the chart has been constructed a number of insights may 
emerge regarding the adjustment problems of certain children. For 
instance, the chart may show the “isolates,” that is, children who are 
not chosen by others for any activity or as friends. On the chart these 
may be represented by outgoing arrows only. The teacher may use 
the information given by the sociogram for remedial work. He may 
encourage other children, chosen by the isolate for certain purposes, 
to reciprocate the feeling. 

The chart may show “mutual choices and mutual rejections,” or 
identify small groups within the larger groups which function as self- 
sufficient cliques. There may be insight into who the leaders of the 
group are, and these leaders then may be utilized to promote the 
personal and social growth of all the children. 
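
The reading of a sociogram described above can be sketched with a small table of choices. The children's names and their choices are invented, `analyze_choices` is a hypothetical helper, and only positive choices are recorded; rejections and cliques, which the text also mentions, are left out of this sketch.

```python
# Hypothetical answers to "With whom would you like to work on the same
# committee?" Each child lists classmates in order of preference.
choices = {
    "Ann": ["Beth", "Carl"],
    "Beth": ["Ann", "Carl"],
    "Carl": ["Ann"],
    "Dan": ["Carl", "Ann"],
    "Eve": ["Beth"],
}

def analyze_choices(choices):
    """Count choices received, and list isolates and mutual choices."""
    received = {child: 0 for child in choices}
    for chooser, chosen in choices.items():
        for other in chosen:
            received[other] = received.get(other, 0) + 1

    # Isolates: children whom nobody chose (outgoing arrows only).
    isolates = sorted(child for child, n in received.items() if n == 0)

    # Mutual choices: pairs in which each child chose the other.
    mutual = sorted({
        tuple(sorted((a, b)))
        for a, chosen in choices.items()
        for b in chosen
        if a in choices.get(b, [])
    })
    return received, isolates, mutual

received, isolates, mutual = analyze_choices(choices)
print("Choices received:", received)
print("Isolates:", isolates)
print("Mutual choices:", mutual)
```

In this invented class, Dan and Eve are isolates, and the children who receive the most choices are candidates for the leadership role the text describes.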

Sociometric techniques thus seem promising evaluative tools. Nevertheless,
certain cautions are necessary. The sociometric test itself 
may have adverse effects upon some of the children tested. It may 
bring about an awareness of isolation and rejection which had not 
been present before. Too often sociometric techniques are used in a 
mechanical fashion. This means that the children are asked to make 
choices without indicating their reasons for choosing. The picture 
emerging thus may be interpreted by the teacher incorrectly and lead 
to further problems. 

The attractions and rejections are reported on the same level of 
intensity. There is no way for the individual to show very strong 
emotion as contrasted with weak emotion. The teacher cannot distinguish
between a child who is simply rejected and one who is deeply 
disliked by some of the others. 

There is grave danger of premature manipulation of the classroom 
situation by the teacher following the application of a sociometric
technique. The philosophy of adjustment held by the teacher plays 
an important role in the decisions he makes on the basis of the socio- 
gram. One teacher may feel that under no condition should isolates 
be permitted to continue in this psychological position. Such a teacher 
may force the isolate into social contacts for which he is not ready, 
or which are less creative for him than the activities he has engaged 
in for himself. 

Despite these objections sociometric techniques provide an avenue 
for exploring the invisible structures of groups. They permit clear and 
graphic presentation of personal relations. Data yielded by these tech-
niques can be obtained by separate investigators and will compare
well. Retesting at intervals will make possible an evaluation of the 
progress of the group toward the goals formulated by the teacher in 
the area of personal-social adjustment. 

The variety of sociometric techniques at present employed and
their role in evaluation are the subject of Chapter Eleven.


(h) Situational Tests 


Psychiatrists and psychologists have come to realize that in addi- 
tion to knowledge of the biological make-up of individuals and the 
basic experiences they have with their parents, a study of the situa- 
tions that act upon the individual is fundamental for understanding 
and evaluating his adjustment patterns. The situation makes demands
upon the individual. He learns what is standard be-
havior for certain situations; he learns whi[...]
number of situations. “Adjustment”
would essentially be related to ada[...]
situations into which one moves during a day, a week, or a year. Good
adjustment would consist in an individual being able to perform ade- 
quately in the many roles assigned to him. For instance, a child may 
have to play the role of an obedient child at home, a cooperative 
pupil at school, and a courageous leader on the playground. 

Situational tests permit the diagnosis of a child's adjustment pat- 
terns and role adequacy by putting him through a variety of situa- 
tions and observing his behavior in them. In the Hartshorne and May 
researches (4), children were studied with reference to their honesty, 
generosity, and self-control by being placed in situations in which 
they might lie, steal, or cheat, or might refuse to do these things. They 
participated in situations in which they might spend time, effort, or
money on themselves or on other children, and in situations in which
they might resist distractions or yield to them.

The theory of situationism has influenced the development of situ- 
ational tests. In contrast to personality inventories, projective tech- 
niques, and personal reports, situational tests evaluate the child's 
adjustment during his actual behavior in the situation. In a sense they 
present a short cut to long range observation and the keeping of 
cumulative anecdotal records. The problems inherent in their usage, 
however, make them a difficult tool for the classroom teacher. 

The essence of the situational test is to put the child through a
replica of a situation which he has faced or might face and in the 
reaction to which one is interested. The teacher or counselor may wish 
to know how a child will adjust to a camp situation, a play situation, 
or a work situation. A personality inventory may reveal whether the
child has problems, and it may give a numerical score in emotional 
adjustment. It does not reveal, however, how the child will behave 
in the specific camp, play, or work situation. In the situational test 
the teacher, with the help of other children and adults, may con-
struct as many elements of the situation to be faced as desired. The
child is then told what the situation is and asked to behave as he 
would when actually confronted by it. Actually the child is asked to 
“act out” his role in the new or old situation. This emphasis upon 
“acting” in situationism has been recognized in the development of 
psychodrama and more recently of sociodrama (2). Sociodrama as 
a diagnostic technique realizes that man is a “role player,” that every 
individual is characterized by a certain range of roles which domi- 
nate his behavior and these roles are imposed upon the individual with 
varying success by the society. 

Sociodrama may be used in many ways. It may be employed as a
situational test in which the child is asked to play a part in a variety
of situations. Or, sociodrama may be used to give insight to a group 
as a whole. By watching and identifying themselves with the actions 
of a particular child, the members of the group may become aware 
of adjustment problems in general. Sociodramatic techniques lend 
themselves to subsequent discussion which encourages the emotional 
and social growth of all children. The teacher may tell the group, 
“Well, you have seen how Johnnie behaved to the counselor in the
camp; would you have behaved differently?” or “Why do you think
Tommy and George did not get along while playing?”

The foregoing discussion indicates that situational tests take many 
forms and may be used to evaluate many aspects of adjustment. 
Specific traits such as honesty and reliability may be assessed by these 
techniques. In addition the keen observer may learn about gross pat-
terns of social adjustment of children in many situations under dif-
ferent conditions.

Situational tests, however, need elaborate preparation and arrange-
ments. Their direction requires skill and planning, and interpretation
of data yielded by them has many ramifications. The classroom teacher
will wish to experiment with these techniques occasionally but may
feel that as continuous instruments of evaluation they are too laborious.


Summary 


For an intelligent application and employment of the techniques
discussed in this chapter the teacher needs some preparation in child
psychology and mental hygiene. The metho[...]
adapted to serve teachers as evalu[...]
personal-social adjustment are: ([...]
rating scales, (c) anecdotal records, a[...]
It is advisable to rely on a combinati[...]
roborative evidence. In this way the limitations
inherent in any one technique are minimized, and
some degree of certainty is achieved as
to the validity of the inferences and conclusions.

Interpretation of the data is [...]
worker or is reached by [...]
and discussion among teachers. [...]
the teacher sho[...]
couraging of emotional growth of the children. If we accept this idea,
remedial action based upon the evidence obtained through the assess- 
ment of adjustment becomes a part of the teaching job. 

Adjustment problems may be of many kinds. Mild adjustment prob- 
lems can be aided and remedied by the classroom teachers. For in- 
stance, Myers (5) has formulated six types of children who have 
problems which are amenable to guidance attempts by the teacher. 
First, there is the “unsociable child” who isolates himself, likes to play
by himself, withdraws from the activities which would involve him
with other children. Second, the “model” child may have a mild ad-
justment problem which needs the attention of the teacher. The neat-
ness, conscientiousness, and courtesy of such a child may conceal an
inordinate need for approval. The “nervous child” is the third type
in this classification. Myers includes children who are timid, anxious,
shy, irritable, tense. This type of child may have developed a number
of semi-neurotic habits such as nail biting, shrugging, twitching, pick-
ing, scratching, and others. The remaining two of Myers' types are the
“defensive” and the “emotional” child. The latter is unable to adjust
because of inability or unwillingness to repress emotion. 

Any teacher with even limited experience would be able to give a 
long list of behavior disturbances which he meets daily. He would also 
be able to give evidence of changes which had taken place as a result 
of his efforts. There are, however, problems which are qualitatively 
different from the ones mentioned—severe cases of complete inability 
to respond. Often the teacher can recognize such problems without the 
help of evaluation devices. Sometimes they will come to the teacher's 
attention through children's answers on a personality test or their be- 
havior in an expressive situation. Children with such problems should 
be referred for analysis and treatment to a specialist or a team of spe- 
cialists such as psychologists, social workers, and psychiatrists. 


Problems for Class Discussion 


1. Use a behavior rating scale to analyze and compare the personal char-
acteristics of several children.

2. Keep anecdotal records of the personal and social behavior of two chil- 
dren for a period of two or three weeks. Analyze these records to discover 
patterns of behavior. 

3. Administer a personality inventory to a pupil and study the results. On 
the basis of this study, plan an interview designed to analyze personal 
and social characteristics which merit further study. 




References Cited in This Chapter 


1. Alschuler, Rose H., and Hattwick, La Berta A., “Easel Painting as an
Index of Personality in Preschool Children,” American Journal of Ortho-
psychiatry, 13:616-625, October, 1943.

2. Bell, J. E., Projective Techniques. New York: Longmans, Green and Co., 
1948. 

3. Coombs, Arthur W., “The Validity and Reliability of Interpretations
from Autobiographies and Thematic Apperception Test,” Journal of
Clinical Psychology, 2:240-247, July, 1946.

4. Hartshorne, Hugh, and May, Mark A., Studies in Service and Self- 
Control. New York: The Macmillan Co., 1929. 


5. Myers, C. R., Toward Mental Health in School. Toronto: University of 
Toronto Press, 1939. 


6. Olson, W. C., “Personality,” Encyclopedia of Educational Research,
pp. 806-817. New York: The Macmillan Co., 1950.

7. Rohde, Amanda A., “Explorations in Personality by the Sentence Com-
pletion Method,” Journal of Applied Psychology, 30:169-181, April,
1946.


8. Rosenzweig, Saul, “The Picture-Association Method and Its Application 
in a Study of Reactions to Frustration,” Journal of Personality, 14:3-23, 
September, 1945. 


References for Further Reading 


Anderson, H. H. and G. L., editors, An Introduction to Projective Tech-
niques. New York: Parker Publishing Co., 1952.
This volume presents various clinical tests for diagnosing personality,
including the Rorschach, Thematic Apperception Test, word association,
sentence completion, Szondi Test, psychodrama, graphology, finger paint-
ing, and drawings of the human form.

Symonds, P. M., Diagnosing Personality and Conduct. New York: D. Apple-
ton-Century Co., 1931.
An extensive survey of all methods of diagnosing personality and con-
duct is provided. Observation, rating scales, self-descriptive inventories,
free association and projective techniques, situational tests, interview, and
other methods are described.

Traxler, A. E., The Use of Tests and Rating Devices in the Appraisal of
Personality. New York: Educational Records Bureau, 1938.
This bulletin explains in nontechnical fashion the advantages and
limitations of tests and rating devices that may be used in the usual school
situation for appraisal of personality. An annotated list of tests and rating
scales is provided.


Evaluating 
Attitudes and Values


CHAPTER NINETEEN 


When the teacher realizes that the attitude towards 
arithmetic affects one's learning of arithmetical information and skills, 
that the attitude towards books affects one's ability to learn to read, 
and that the attitude towards school and its various phases affects one's 
ability to learn in school, then the teacher is in a position to appreciate 
the fundamental importance of attitudes in education. Since a favor- 
able attitude towards a person, object, or activity connected with 
school is more likely to motivate a person to do well in school and 
since negative attitudes towards school serve to hamper maximum
learning, the teacher should be aware of the attitudes of pupils toward 
these factors which figure so significantly in the educational process. 
The importance of attitudes, therefore, lies in their close relation to 
the efficiency of the learning process. 

Attitudes and values are important also as indicators of how one 
can expect people to behave in future situations. The student who 
is prejudiced against "foreigners" can be expected to avoid contacts 
in or out of class with those whose accents differ from his own. From 
the positive angle, the student who has a tolerant and favorable atti- 
tude toward members of minority groups can be expected to share ac- 
tivities with such individuals in his classroom and in out-of-school 
situations. 

The importance of attitudes and values should be recognized, too, 
in the whole process of problem-solving. When someone has a prob- 
lem, it means that he is faced with a situation in which he does not 
know what to do. He cannot act in an habitual way because this situ- 
ation is different from anything in his previous experience. He can- 
not decide what to do because he does not know what means or course 
of action will result in the attainment of his goal or end. He searches
around for the various alternatives before him, and then asks himself
what consequences will follow if he pursues each of his alternatives. 
Finally, he must try out one of the alternatives as the best possible 
course of action to attain his goals. If the goal is not an immediate 
one, it is based upon something deeper, his values. In some cases, 
the mere attainment of certain goals is not enough if one's values are 
disregarded in the process. The importance of values, therefore, is 
that they provide criteria for the resolution of daily problems. Values 
serve to help people think through what they ought to do, for thinking 
is this process of finding and testing meanings in problem situations. 
Values, then, may be looked upon as underlying thinking. 

The importance of attitudes in the learning process and in living 
compels the modern school to take cognizance of attitudes in its ob- 
jectives. The effective teacher must attempt to integrate attitudes with 
the development of knowledge, skill, and interests. More specifically, 
the teacher should seek to achieve the following objectives: (1) to 
encourage desirable attitudes, (2) to discourage undesirable attitudes, 
and (3) to promote new and broader attitudes. Acceptance of these 
three objectives assumes that the teacher can determine what the
present attitudes of his pupils are. This chapter discusses some stand-
ardized and non-standardized evaluative techniques in this area.


The Definition of Attitudes and Values 


A comprehensive description of the meaning of attitude is presented
in the Dictionary of Education (5) which defines attitude as “a state
of mental and emotional readiness to react to situations, persons, or
things in a manner in harmony with a habitual pattern of response
previously conditioned to or associated with these stimuli.” Some in-
vestigators regard attitudes as a feeling or predisposition to favor or
be against objects, ideas, persons, or groups. In this connection one
may study attitudes as either-or (for or against something) or as a
matter of degree (favor it highly, favor it to some extent, be indif-
ferent, dislike to some extent, and dislike it highly).

There is a great deal of controversy among various experts on the
meaning of values. Friedman (4) lists sixteen different ways in which
philosophers have defined values. In general, these definitions include
such words as feelings, attitudes, preferences, and action, alone or in
combination.

If a difference truly exists between the two terms, it might be found
in the theory that attitudes are more specific and numerous while
values are more general and fewer. In other words, there may be fa- 
vorable attitudes towards various kinds of people, a separate attitude 
for each type of person, but the value which would apply might be 
the generalized ideal, a respect for the personality of all individuals. 
The value is more comprehensive and may help to explain the opera- 
tions of many specific attitudes. Stated another way, values are those 
few fundamental generalized ideals which can serve to explain the 
rational behavior of man. Typical illustrations of values generally asso- 
ciated with democracy as an ideal are respect for personality, faith 
in intelligence, equality of opportunity and treatment, government 
under laws made by the people, freedom of speech, press, assembly, 
and religion, etc. 

Attitudes and values should be seen not as separate and distinct
from such elements as interests, needs, aptitudes, and emotions, but 
rather as interrelated factors in the nature and development of the 
total person. Favorable interests in tennis may fulfill a need for physi- 
cal exertion. This interest may create favorable attitudes towards 
people who play tennis and so may serve to promote a value of respect 
for all, regardless of race or religion, etc. Pleasant associations may 
encourage favorable emotions towards minority peoples. 

In learning situations, new information and knowledge may lead 
to a change in attitudes and values. Also, the development of pro- 
ficiency in any given skill may result in favorable attitudes towards a 
particular classroom activity such as reading, painting, or arithmetic. 


The Evaluation of Attitudes and Values 


At the nursery- and lower-elementary-school level, the teacher must 
play a dominant role in evaluating attitudes associated with the school 
program. Since the nursery- and elementary-school program includes 
objectives in health habits, social adjustment, physical skills, interests, 
and parent education as well as fundamental skills, the teacher should 
develop a system of anecdotal records, checklists, rating scales, logs, 
and observational notes on the attitudes of his children and their par-
ents. Insofar as the teacher at this level is clear on the proper attitudes
towards various aspects of child life (eating, sleeping, bedwetting,
family relationships, playing, etc.), he may teach these attitudes to
children and parents. As the children and parents become familiar 
with these attitudes, they serve as criteria for evaluating self and
others. Gradually, therefore, as the child learns about one or two cri-
teria, he can apply them to the behavior of his imaginary friends or
dolls or toys, etc. This constant application of criteria serves to develop 
evaluative skills in child and parent. Through verbal techniques of 
interview and discussion, the teacher can promote evaluation skills for 
self, colleagues, children, and parents. 

At the upper-elementary-school level, while the teacher still plays 
a dominant role in the evaluation process, the increasingly mature stu- 
dents are in a position to assume more and more responsibility for 
self-evaluation and group evaluation. Also, with increased social 
adjustment, children are more capable of committee work and cooper- 
ative evaluation. By the junior- and senior-high-school levels, assuming 
previous evaluation experiences, one can expect pupils under teacher 
guidance and with teacher cooperation to evaluate with higher de- 
grees of skill and efficiency. At the junior- and senior-high-school levels, 
when school projects may involve parents, community groups, and 
agencies, evaluation may include not only teacher and pupils but all 
those who have shared in the educational experiences. 

There are two general means which teachers might employ to evalu- 
ate attitudes and values. The teacher may be able to find standardized 
attitude tests which are appropriate to his purposes. Ordinarily, how-
ever, the teacher will be compelled to construct devices of his own in
order to meet the need for a particular type of test to suit his own
unique group. Very often, too, the teacher will find that he must
rely on devices which he constructs himself because the available
budget for testing materials cannot be extended to include more than
intelligence and achievement tests. Because of these practical con-
siderations, the following discussion will emphasize teacher-made
devices.

STANDARDIZED TESTS AND TECHNIQUES 


Anyone who consults available sources of information concerning
published test materials such as the Mental Measurements Yearbooks
(2, 8) will be struck by the paucity of published instruments in the
area of attitudes which are suitable for use below the college level.
Some of the more common approaches which have been utilized to
measure attitudes and values are described below.


(a) The Bell School Inventory (Published by the Stanford University
Press). The purpose of this inventory is to determine the attitude of
high-school students towards school. Approximately two-thirds of the
seventy-six items deal with student reactions to teacher personality 
and teacher-pupil relationships. The remainder are directed to such 
aspects of school life as discipline, classmates, curricular and extra- 
curricular activities, and marking systems. Typical items are: 


Do you have difficulty in keeping your mind on what you are 
studying? 
Do you think that some of your teachers are narrow-minded? 


Do you think that some of your teachers act as if they were 
bored with their work? 

Do you think that some of your teachers lack a sense of 
humor? 


Student responses take the form of drawing a circle around Yes, 
No, or ?. The inventory is scored by placing a transparent stencil 
over the answer column and counting the number of items indicating 
“dissatisfaction” which the student has circled. Norms are provided 
which enable the scorer to rate student satisfaction on a scale ranging 
from “excellent” to “very unsatisfactory.” 
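
The stencil-and-count scoring just described amounts to tallying the responses that match a key. The sketch below is illustrative only: the function name `dissatisfaction_score` is hypothetical, and so is the key, which assumes that a “Yes” to each of the four sample items quoted above is the response taken to indicate dissatisfaction. The published norms are not reproduced here, so the count is not converted to a rating.

```python
def dissatisfaction_score(responses, key):
    """Count the items answered in the direction keyed as dissatisfaction.

    `responses` maps item number to the circled answer ("Yes", "No", or "?").
    `key` maps item number to the answer taken to indicate dissatisfaction.
    """
    return sum(1 for item, answer in responses.items() if key.get(item) == answer)


# Hypothetical key for the four sample items: "Yes" signals dissatisfaction.
key = {1: "Yes", 2: "Yes", 3: "Yes", 4: "Yes"}

# Hypothetical answers from one student.
responses = {1: "Yes", 2: "No", 3: "?", 4: "Yes"}

score = dissatisfaction_score(responses, key)
print(score)
```

Because the result is a single count, it cannot by itself show which area of school life produced the dissatisfaction.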

Like all instruments of this type, the validity of the inventory is 
an outgrowth of the student’s truthfulness. In many instances, the 
score obtained by different students may reflect not their relative dis- 
satisfaction with school, but their relative reticence. In this particular 
scale, too, the reliance upon a single score makes it difficult to de- 
termine those areas of school life which are most conducive to lack 
of satisfaction. Inventories of this type probably have their greatest 
value when used as a preliminary approach to an interview with an 
individual student. 


(b) Remmers Attitude Scales (Published by the Division of Educa- 
tional Reference, Purdue University). Under the editorship of H. H. 
Remmers of Purdue University, many different attitude scales have 
been constructed. These scales purport to measure such generalized 
attitudes as those toward (1) any disciplinary procedure, (2) any 
national or racial group, (3) any school subject, (4) any vocation, 
(5) any teacher, and (6) any play. The student is asked to indicate 
his agreement with a series of phrases or sentences, and a total rating, 
based upon a composite of the weights assigned to the checked items, 
is obtained. A typical scale is reproduced below in part: 




A SCALE FOR MEASURING ATTITUDE TOWARD ANY TEACHER
L. D. Hoshaw 


Name of teacher to be rated. 


Directions: The following is a list of statements about teachers. 
Place a plus sign (+) before each statement with which you agree 
with reference to the teacher whose name appears in the blank 
above. Mark only those statements which you know to be true 
about the teacher. Your score will in no way affect your grade in 
any course. 


_____ 1. Is perfect in every way.
_____ 2. Makes the subject interesting.
_____ 3. Grades papers fairly.
_____ 4. Is an aid in developing high ideals.
_____ 5. Is always polite.
_____ 6. Is always pleasant.
_____ 7. Is one of the best citizens in the community.
_____ 8. Is brilliant.
_____ 9. Can talk well on many subjects.
_____ 10. Is progressive.
_____ 11. Is interested in the school activities.
_____ 12. Inspires respect on the part of the students.
_____ 13. Is natural and unaffected.
_____ 14. Has the moral support of the community.
_____ 15. Has a reason back of every request.
_____ 16. Is energetic.
_____ 17. Will admit an error if convinced.
_____ 18. Dresses well.


Here, too, the usefulness of the instrument is limited by the truth-
fulness of the student's responses. The inclusion of extreme statements,
such as “Is perfect in every way,” and of uncontested statements, such as
“Has the moral support of the community,” also detracts from the accuracy
of scores which are obtained through the use of the scale.
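
The text says only that the total rating is a composite of the weights assigned to the checked items. As a hedged illustration, the sketch below assumes one possible composite, the median weight of the checked statements. The function name `attitude_rating` and every weight shown are hypothetical, not values from the published scale.

```python
from statistics import median


def attitude_rating(checked, weights):
    """Return the median weight of the checked statements, or None if none are checked."""
    if not checked:
        return None
    return median(weights[item] for item in checked)


# Hypothetical weights for a few statements (not the published values).
weights = {1: 10.3, 2: 8.9, 5: 8.1, 18: 6.5}

# A student places a plus sign before statements 2, 5, and 18.
rating = attitude_rating([2, 5, 18], weights)
print(rating)
```

Under this assumption, a student who checks only extreme statements such as item 1 would receive a rating near the top of the scale, which is one way such statements can distort the result.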


(c) Test of Beliefs on Social Issues (Forms 4.21 and 4.31; published
by the Educational Testing Service). This scale was originally con-
structed for use in the Eight-Year Study (6) though it is now out of
print. It is composed of two separate sections of 200 statements deal-
ing with the areas of democracy, economic relations, labor and unem-
ployment, race, nationalism, and militarism. The two sections, which
are administered at different times, are so arranged that opposing
points of view are presented in contrasting items. Sample items drawn
from the two sections of the test are presented below:




(4.21) 14. Most workers who are unable to provide for
themselves during a period of unemployment
have been too shiftless to save.

(D. Liberal; A. Conservative; U. Uncertain)

(4.31) 104. The wages of most workers are so low that it is 
impossible for them to save enough money to 
support themselves during periods of unem- 
ployment. 

(A. Liberal; D. Conservative; U. Uncertain) 


As the scoring key given with the sample items indicates, the scale 
yields scores of liberalism, conservatism, and degree of uncertainty. 
In addition, a measure of consistency, determined by the extent to 
which the student expresses the same point of view on both sections of 
the scale, may be obtained. The nature of the topics covered by the
Test of Beliefs on Social Issues and the novel consistency aspect of the scoring
scheme make this instrument extremely valuable for teaching pur- 
poses. 
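
The scoring scheme described above can be sketched as follows. This is an illustration under stated assumptions, not the published scoring procedure: the function name `score_beliefs` is hypothetical, items 14 and 104 are treated as one contrasting pair because they are the two samples shown, and consistency is taken to be the share of pairs answered from the same point of view, with a pair marked uncertain on both items counted as consistent.

```python
def score_beliefs(responses, key, pairs):
    """Tally points of view and measure consistency across contrasting items.

    `responses` maps item number to "A" (agree), "D" (disagree), or "U" (uncertain).
    `key` maps item number to the response that counts as liberal.
    `pairs` lists (item, item) tuples of contrasting statements.
    """
    tally = {"liberal": 0, "conservative": 0, "uncertain": 0}
    views = {}
    for item, answer in responses.items():
        if answer == "U":
            view = "uncertain"
        elif answer == key[item]:
            view = "liberal"
        else:
            view = "conservative"
        views[item] = view
        tally[view] += 1

    # Assumed measure: share of pairs answered from the same point of view.
    same = sum(1 for first, second in pairs if views[first] == views[second])
    consistency = same / len(pairs)
    return tally, consistency


# Key taken from the two sample items: disagreeing with item 14 and
# agreeing with item 104 both count as the liberal response.
key = {14: "D", 104: "A"}
pairs = [(14, 104)]

consistent = score_beliefs({14: "D", 104: "A"}, key, pairs)
inconsistent = score_beliefs({14: "A", 104: "A"}, key, pairs)
print(consistent)
print(inconsistent)
```

A student who agrees with both sample statements is expressing opposite points of view on the same issue, and the consistency figure falls accordingly.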


(d) Test of Social Problems (Form 1.42; published by the Educational 
Testing Service). Another novel approach is represented in this instru- 
ment, which was also constructed for the Eight-Year Study (6) and is 
now out of print. The test is designed to determine the value principles 
used by high-school students in their solution of social problems. It 
contains six problems with about twenty-five reasons for selecting a 
number of possible solutions to each problem. The student is requested 
to select a course of action (which reflects a particular attitude or 
set of values by inference) and then to choose the reasons for his 
course of action. A part of one of the problems is reproduced below: 


Cotton has been picked by hand, which is a slow and expen- 
sive process. Recently, the Rust brothers invented a machine 
to do this work. It would pick in 7½ hours as much cotton as
one hand-picker could pick over a whole season of eleven 
weeks. The cost of production of cotton could be reduced 
from $14.52 to $3.00 per bale. To date this machine has not 
been placed on the market. What should be done with this 
machine? 


Courses of Action

1. The invention should be made available for unrestricted
manufacture and sale of the machine _____ 1.

2. The machine should be manufactured and sold under
some form of government control and provisions made for
establishing in other jobs the cotton pickers who are thrown
out of work _____ 2.

3. Workers and cotton growers should form a cooperative
and should use the profits from it to take care of the people
thrown out of work by the machine _____ 3.

4. The machine should not be put to use at the present time
at all _____ 4.


Reasons

1. In business the efficiency of production should be con-
sidered ahead of anything else _____ 1.

2. Uncontrolled use of the machine would give an advantage
to the few large landowners in the south over the poor land-
owners _____ 2.

3. Uncontrolled introduction of a labor-saving machine
throws large numbers out of work _____ 3.

4. When labor-saving machines are introduced, the people
who are displaced by them find work in making the new
machines _____ 4.

5. By being able to reduce the price of cotton production,
the American cotton growers can sell more on foreign mar-
kets and thus increase their income from exports _____ 5.

(e) Contemporary Problems (Published by the Bureau of Publica- 
tions, Teachers College, Columbia Univ.). This test is designed to 
measure the junior-high-school pupil’s tendency to choose democratic 
methods in preference to undemocratic methods in evaluating and 
solving social problems. While most scales ask the student to take a 
position for or against given social issues, the Contemporary Prob- 
lems Test is unusual in that it asks the pupil to choose one of several 
methods of acting to solve a social problem. A single social situation
is used as the starting point for the presentation of a number of test
items, thus:


[...] than in just studying about
farm products from books. Mr. Josephs, the teacher, told the class
that people at Bruce Corners wanted them to come.

“Let's have a committee look into the problem of taking the
trip,” suggested Irene. Most of the members of the class liked
Irene's suggestion. Jane asked, “How should we pick good com-
mittee members?”
(1) Which one of the following statements should be given 
MOST consideration when picking members for this trip com- 
mittee? 
A. Any member of the class could belong to the committee. 
This problem concerns everybody in the class. 
B. Only the smartest students should be on the committee. 
C. Students who always go on trips would be best because 
they like school and they do better in school. 
D. The committee members should be picked by Mr. 
Josephs, the teacher. 
E. Students whose parents are interested in school should 
be picked. 
The class finally picked a trip committee. The committee went
over many of the details of the trip. Dave, a member of the com-
mittee, reported that to stay for a week would cost about $15.00
each. “Fifteen bucks,” said Dave, “is a lot of money! A lot of us
couldn't go!” Dave urged the committee to ask the class NOT to
take the trip because not everybody could afford to go.
(2) Which one of the following is the BEST reason for NOT 
taking the trip? 
F. Taking the trip would split the class into rich and poor. 
G. Everybody would not be getting a chance to do some- 
thing interesting. 
H. The students who can't afford to go need to take the 
trip. Rich students can always travel on their own. 


I. Unless everybody does the same thing, it’s undemo-
cratic.
J. The teacher should not plan a trip unless all the students 
in the class have enough money to go. 


Teacher-Made Techniques 


Two types of techniques are generally available to the teacher who
is interested in evaluating the attitudes of his pupils. These approaches
may be made through observational methods and through pencil-
and-paper practices.


OBSERVATIONAL TECHNIQUES IN THE LOWER GRADES


Teacher-made techniques at the nursery-school, kindergarten, and 
primary-school levels are generally limited to the observation of child 
behavior. If teachers are continuously sensitive to what the child says,
the activities the child selects and how he participates, the "faces" 
the child makes, what the child does in playing with blocks and in 
painting, the imaginary or real stories he tells, the wishes he may ex- 
press, and the informal, incidental, and unsolicited comments of other 
children about each child, they have the raw material out of which 
a picture of the attitudes and values of a child can be drawn. In 
all of these situations the role of the teacher is primarily that of an 
observer. The teacher may also initiate activities or ask children to 
respond individually or as a group to a picture, a drawing, a song, 
etc., in order to elicit attitudes towards these objects or their makers. 

Since it is not humanly possible for the teacher to remember every 
significant thing about each child in the room, recording techniques 
are necessary. Several techniques may be utilized to help the teacher 
keep a permanent and accurate record of child behavior. The teacher 
may use, separately or in combination, an anecdotal record, an interview,
a checklist, a rating scale, a log, or a mechanical device such
as a motion picture or recording. Illustrative samples of possible 
teacher-made devices for each of the first five techniques are pre- 
sented below. At the present time, motion pictures and recordings are 
too expensive and impractical for the average school. 


The Anecdotal Record (A more complete discussion of anecdotal
records may be found in Chapter Seven.) A student teacher in a
kindergarten class developed a set of anecdotal records around six
objectives:

1. Understanding and practicing desirable social relationships
2. Discovery and development of desirable individual attitudes
3. Appreciation of and desire for worthwhile activities
4. Command of common integrating skills and knowledges
5. Development of sound body and desirable mental attitudes
6. Development of good health habits



A typical record dealing with a child's attitude toward cleanliness is 
the following (the card is 4 x 6): 




1 2 3 4 (5) 6                                        4-5-55
                                                     Jackie

Swept cookie crumbs off her table and
into her empty juice cup.

Comments: Typical behavior, keeps herself and
her things neat and clean

a.
b.
c.
d.
e.
f.
g.
h.


The 1-6 in the upper-left-hand corner is a code for the six objectives;
the enclosed 5 indicates that this incident illustrates objective 5. The
date and name of the child are in the right-hand corner. The letters
down the side refer to different times of the daily schedule, e.g., d is 
the morning refreshment period. 
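
A teacher who keeps such cards in a file could represent each one as a small record. The Python sketch below is only an illustration of the card layout just described, not a procedure from the original study: the class name `AnecdotalRecord`, its field names, and the choice of period letter "d" for this incident are assumptions made for the example.

```python
from dataclasses import dataclass

# The six objectives listed by the student teacher, keyed by code number.
OBJECTIVES = {
    1: "Understanding and practicing desirable social relationships",
    2: "Discovery and development of desirable individual attitudes",
    3: "Appreciation of and desire for worthwhile activities",
    4: "Command of common integrating skills and knowledges",
    5: "Development of sound body and desirable mental attitudes",
    6: "Development of good health habits",
}

# Letters a-h stand for periods of the daily schedule.
PERIODS = tuple("abcdefgh")


@dataclass
class AnecdotalRecord:
    """One card (hypothetical structure; field names are illustrative)."""
    child: str
    date: str        # kept exactly as written on the card
    objective: int   # the enclosed code number, 1-6
    period: str      # schedule letter, a-h
    incident: str
    comments: str = ""

    def __post_init__(self):
        if self.objective not in OBJECTIVES:
            raise ValueError("objective must be a code from 1 to 6")
        if self.period not in PERIODS:
            raise ValueError("period must be a letter from a to h")


card = AnecdotalRecord(
    child="Jackie",
    date="4-5-55",
    objective=5,
    period="d",  # assumed here; the text says d is the morning refreshment period
    incident="Swept cookie crumbs off her table and into her empty juice cup.",
    comments="Typical behavior, keeps herself and her things neat and clean",
)
print(OBJECTIVES[card.objective])
```

Validating the objective code and period letter when the card is created keeps a later tally by objective or by time of day from silently including miscoded cards.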

On page 366 is a more elaborate anecdotal record used by a nursery- 
school teacher to study the resting behavior of a group of children 
by checking the activities which preceded the rest period (see left 
column). 

Included on the sheet (8 x 12) was the code for name (center), date 
(upper right), the activity teacher engaged in while children rested 
(right column). Notice that R.S. had developed a favorable attitude 
toward the rest period for he cooperated and desired others to do so 
too. 


The Interview The interview is one of the most valuable techniques 
the teacher can employ. In an interview the teacher can “see” behind 
obvious surface statements and follow up questionable points or prom- 
ising leads. The interview might be used to find out why a child has 
antagonistic or accepting attitudes towards various people or activi- 
ties. In all cases, it might also be wise to interview parents, because 
they are a source of much enlightening data about their children. 
After an interview with a child in the second grade, a teacher might 
desire to record as note, log, or anecdote the following remark: 


I don't like to read books. I want to be like my father and he 
never reads books. I like to play ball. 


S.C.   H.M.   P.S.   T.P.   P.F.   L.S.   G.T.                April 27

Activity Before Rest

Inside: Sandbox, Blocks, Doll corner, Books, Puzzles, Music, Rhythms,
Painting, Fingerpaint, Waterplay, Blackboard, Climbing frame, Pull toys,
Dress up, Hammer & nails, Clay, Floor toys, Beads, Pulley from ceiling

Outside: Bikes, Wagon, Boxes, Playhouse, Jungle Jim, Slide, Sandbox,
Soap bubbles, Running games, Planks

Lay down right away and listened intently to the stories. In about the
middle of the second story he sat up. He talked a great deal about the
pictures. He even requested a book that he wanted read. For the big
resters game he rested very quietly. When T.P. refused to put her feet
down for the game, he looked over and said, "You better lay down or you
won't be able to get up from rest."

Teacher Activity: Setting Tables, Put Toys Away, Finger Plays, Story,
Blackboard, Student Teacher




An antagonistic attitude towards reading is evident. A series of such 
remarks and actual avoidance behavior in class all would add up to 
confirm the teacher in his appraisal of the child's attitude. 


The Checklist The checklist is a fine device for reassuring the 
teacher that he is noting all the things he may have neglected through 
less formal observational techniques. A simple checklist might be de- 
signed to secure a sampling of attitudes towards food, people, games 
and activities, and health habits. For instance, the teacher might use 
the checklist on page 368 as a source of information for a subsequent
letter home to a child's parent in place of a traditional report card. 


The Log The log is a convenient method of recording daily events. 
For instance, all the teacher has to do is to note the date and a quick 
summary of the attitude expressed by a pupil during that day. A series 
of entries in a log book might be: 


September 9 James N. said he didn't like to sit near Jane, 
a member of a minority race. Jane knows she 
is not liked by James N. 

September 20 James N. invited Jane to be a member of his 
refreshment committee for the school party. 
For the past two weeks, our efforts to get 
James N. to appreciate Jane are succeeding. 

January 10 James N. selected Jane to be the farmer's 
wife in the game, "The Farmer in the Dell." 
I haven't noticed any signs of antagonism 
towards Jane since September. 


OBSERVATIONAL TECHNIQUES IN THE UPPER GRADES 


Teacher-made techniques at the upper-elementary and junior- and 
senior-high-school levels include many approaches. These may be 
classified under three general headings. The first includes techniques 
which represent direct observational reporting of actual daily behavior. 
The second includes those approaches in which a stimulus is pre- 
sented to students in order to secure their responses. The third is the 
use of make-believe dramatic situations, or socio-dramatic techniques 
as they are called. 


NURSERY SCHOOL CHECKLIST                              SUSAN SMITH

ATTITUDES TOWARD          FAVORABLE    INDIFFERENT    UNFAVORABLE
                          Date *       Date           Date
A. Food
   1. Juices
   2. Soups
   3. Meats
   4. Fish
   5. Vegetables
   6. Desserts
   7. Milk
B. People
   1. Teacher
   2. Parent
   3. Boys
   4. Girls
   5. Visitors
C.
   1. Sandbox
   2. Painting
   3. Playing House
   4. Workbench
   5. Blocks
D. (Write In Others)

* Insert date in box next to each item when observed.


Observing Actual Behavior Measuring true behavior, rather than
what people say they would do, is a fundamentally sound goal in
evaluation. It is likely that this criterion for valid measurement will be
increasingly met as the school curriculum shifts from an emphasis on
subject-matter learning to stress upon activities at the level of the
child in and out of the classroom. If the teacher were interested, for
example, in the attitude of a child toward the Negro, then instead of
administering a pencil-and-paper test he might have the class visit a Negro
home, church meeting, or social gathering in order to observe
first-hand the true reactions of the pupils. Of course, one may question
whether the child would behave in the same fashion if the visit were a
voluntary one, and the presence of the teacher may or may not affect
the behavior of the student.


Presentation of Stimuli The attempt to secure student expressions 
of attitude and value is also promoted by means of having the student 
react to various forms of stimuli. Teachers of different subjects and at 
different levels may use the following approaches to stimulate pupil 
responses in the form of discussions, panels, speeches, essays, letters, 
etc.: (1) show a movie, (2) play a recording, (3) read a part of a
letter, magazine article, or news article, (4) play the radio, (5) go to
a play, (6) present a panel, a speech, a forum, a debate, or discussion,
(7) present posters, (8) read advertisements, (9) examine international,
national, or local reports, (10) conduct experimental studies in a
laboratory course, (11) show a picture or present any specific event
(television or first-hand observation). The responses of students can 
be recorded in any of the ways already outlined. 


The Sociodrama A third means of ascertaining attitudes and values
of students is relatively new to the teaching field. This is the
sociodrama. While this has been a profitable technique in psychotherapy
employed by clinicians in mental hospitals, little research literature
on the classroom values of this technique is available. In this
technique, one, two, or more characters assume a role and spontaneously
carry out the dramatic situation to an end. The assumption is that
people will project their true feelings, attitudes, and values freely
when they are actively involved in a spontaneous situation. For
instance, in a social studies class the teacher might have one youngster
play an industrial worker who is being advised to go on strike by an
organizer. The two characters may play the roles assigned and either
state their own views or what they think the stereotype would do.
For the home economics teacher, the situation posed might be a girl
wearing a particular dress or outfit. Two or three girls might discuss
their reactions or attitudes toward such clothing, etc. The possibilities
are as endless as the practical imagination and skill of the teacher.
The results of the sociodrama may be recorded in note, log, anecdote,
stenographic report, and recording forms. Class discussion might be
used to evaluate the sociodrama, though this may be difficult when
students are not sufficiently mature to be able to discuss personal
values.


Controlled vs. Uncontrolled Observation The observation of behavior
to reveal attitudes and values may be controlled or uncontrolled.
In the controlled situation, the observer may have a checklist,
rating scale, score card, questionnaire, inventory, or itemized guide
to the writing of an anecdote, log, or essay. An uncontrolled observation
would be one in which the observer receives no advance directions
about what to observe or how to observe. In this case, the report
may take the form of verbal or written anecdotes, notes, essays, etc.


PAPER-AND-PENCIL TECHNIQUES 


Paper-and-pencil techniques are the most frequently used forms of
attitude testing in the schools today. The true-false or agree-disagree
inventory and the checklist are perhaps the most frequently
encountered approaches, although rating scale, social-distance,
and paired comparison methods are also utilized. Examples of the
varied paper-and-pencil approaches which lend themselves most
readily to teacher use are given below.


city school system: 


The government should take care of people who are too old
or too sick to work.

Public schools should provide free transportation for all
students to and from school.



The government should step in and control prices when they 
get high. 

The government should guarantee jobs for people who are 
able to work. 

The government should not collect income taxes from anyone 
who makes less than $5000 a year. 


With more sophisticated respondents, it is possible to call for finer 
discrimination than that involved in simple agreement or disagree- 
ment. Reproduced below are the directions and a portion of a scale 
used to secure a measure of teacher attitude to the practice of organ- 
izing special classes for intellectually gifted children (IGC): 


Below are a series of 30 statements concerning IGC classes. Please 
circle the view which best expresses your own personal feelings, 
i.e., STRONGLY AGREE (SA), AGREE (A), UNDECIDED
(U), DISAGREE (D), or STRONGLY DISAGREE (SD). Be 
sure to circle one view in each of the 30 statements. 


1. It would be better practice to accelerate
   intellectually gifted children .......... SA  A  U  D  SD
2. Parents of children in IGC classes try to
   interfere with the teacher's work ....... SA  A  U  D  SD
3. Children who are enrolled in IGC classes
   tend to become conceited about their
   abilities ............................... SA  A  U  D  SD
4. The attitude in IGC classes is too com-
   ......................................... SA  A  U  D  SD
5. Teachers of IGC classes get easier special
   assignments (such as yard duty, etc.) ... SA  A  U  D  SD
6. Children in IGC classes tend to be above
   average in social adjustment ............ SA  A  U  D  SD
7. IGC classes tend to neglect the
   fundamentals ............................ SA  A  U  D  SD
8. You get more cooperation from parents of
   children enrolled in IGC classes ........ SA  A  U  D  SD


A numerical score may be obtained by assigning point values to 
each of the possible replies: SA = 5, A = 4, U = 3, D = 2, SD = 1,
when the statement is cast in such a way that the IGC class is placed 
in a favorable light. When the statement reflects discredit upon the 
special class, these point values would be reversed. 
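
The scoring rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scoring key of the original study: the function names, the `favorable` flags, and the sample replies are hypothetical. Only the point values and the reversal for unfavorably worded statements come from the text.

```python
# Point values for a statement that places the IGC class in a favorable light.
FAVORABLE_POINTS = {"SA": 5, "A": 4, "U": 3, "D": 2, "SD": 1}


def score_item(response: str, favorable: bool) -> int:
    """Return the point value of one reply, reversing the key when the
    statement reflects discredit upon the special class."""
    points = FAVORABLE_POINTS[response]
    return points if favorable else 6 - points


def total_score(responses, favorable_flags):
    """Sum the item scores for one respondent."""
    return sum(score_item(r, f) for r, f in zip(responses, favorable_flags))


# Hypothetical replies to three statements; the second and third are assumed
# to be worded unfavorably, so their point values are reversed.
responses = ["A", "SD", "U"]
favorable_flags = [True, False, False]
print(total_score(responses, favorable_flags))  # 4 + 5 + 3 = 12
```

Subtracting from 6 reverses a five-point key exactly (5 becomes 1, 4 becomes 2, and 3 stays 3), so a high total always indicates a favorable attitude toward the special class.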

When respondents are called upon to give a numerical rating which
indicates relative acceptance or rejection of a given statement, a direct
rating scale approach is being utilized. The use of the rating
technique makes it mandatory to cast all statements in one form: all of
the statements must reflect acceptance or rejection of the concept,
technique, or process toward which the respondent's attitude is
being evaluated.


The Checklist Another modification of the agree-disagree inventory 
takes the form of a checklist. A typical measure of this type is repro- 
duced, in part, below: 


We would like to find out how you feel about the class you
are now attending. Below are 20 statements about a class.
You are to ask yourself the question, "Does this statement
describe my class?" If the statement is a good description of
your class, put a check (✓) on the line in front of the
statement. If the statement is a poor description of your class, put
a cross (x) on the line. If you are not sure, put a question
mark (?) on the line.
____ 1. Real friends are hard to find.
____ 2. Almost everyone is ready and willing to work.
____ 3. Almost all the pupils appreciate what you do for
        them.
____ 4. A good many pupils try to take advantage of you.
____ 5. We really need a better classroom to do our best
        work.
____ 6. Almost everyone minds his or her own business.
____ 7. You can really have a good time in this class.
____ 8. This would be a good class if it weren't for one or
        two pupils.


Here, too, simple acceptance or rejection of a descriptive statement
may be looked upon as a relatively crude device. Attempts have been
made to secure a finer measure of attitude by utilizing a form of
checklist in which the responses to be checked may be considered
indicative of two levels of acceptance or two levels of rejection. Items such
as the following are characteristic of the approach used. (The reader
will note the similarity to the Contemporary Problems Test described
above.)


A class is graduating in June and the girls have decided to
wear long white dresses to graduation. Judy, who has been
working after school taking care of younger children, says
that her family needs the money she earns and that she will
not waste it on a dress which she can wear only once. If you
were in the class, would you:

____ A. vote to collect the money to buy Judy's dress?
____ B. vote to have everyone wear very simple dresses
        which could be worn all summer?
____ C. vote to have the graduation without Judy?
____ D. vote to allow Judy to attend in any dress she
        wishes?
____ E. ? ? ?


A certain class has planned to go on an all-day bus trip to a
large city. A week before the trip, the teacher learns that the
restaurants in the city do not serve Negroes. There are five
Negro pupils in the class. If you were in the class, would you:

____ A. vote to have the whole class eat together in the
        bus?
____ B. vote that the Negro children should eat in the bus
        while the rest of the class go to a restaurant?
____ C. vote to ask the Negro children not to go on the
        trip?
____ D. vote to visit another city where the whole class
        can go to a restaurant?
____ E. ? ? ?


Paired Comparisons The paired comparison approach is an out- 
growth of psychophysical research. The essence of the test is the mak- 
ing of comparative judgments between two objects. For example, the 
pupil may be presented with a list of school subjects or activities and 
asked to compare each subject or activity with every other in terms 
of a criterion such as, “which do you like better?” For example: 


A. Listen to the teacher read a story. 
B. Work with a group to build the scenery for a play. 


A. Work with a group to build the scenery for a play. 
B. Learn about how to make out a check. 


A. Learn about how to make out a check. 
B. Listen to the teacher read a story. 


The method of paired comparisons involves a considerable amount 
of statistical work in order to determine the respondent's relative ac- 
ceptance of the objects under consideration. Moreover, the range of 
attitudes to which this approach is applicable is limited. 
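
The simplest part of that statistical work, forming every pair and counting how often each activity is preferred, can be sketched as follows. This is an illustrative Python tally only: it does not perform the fuller scaling procedures the method usually calls for, and the pupil's choices and the function name are hypothetical.

```python
from collections import Counter
from itertools import combinations

activities = [
    "Listen to the teacher read a story",
    "Work with a group to build the scenery for a play",
    "Learn about how to make out a check",
]

# Every activity is compared with every other: n * (n - 1) / 2 pairs.
pairs = list(combinations(activities, 2))


def preference_counts(choices):
    """Count how many times each activity was chosen.

    `choices` maps each pair (as produced above) to the preferred activity.
    """
    counts = Counter({activity: 0 for activity in activities})
    for pair, preferred in choices.items():
        if preferred not in pair:
            raise ValueError("the preferred activity must be one of the pair")
        counts[preferred] += 1
    return counts


# Hypothetical answers from one pupil.
choices = {
    pairs[0]: activities[1],  # story vs. scenery: scenery preferred
    pairs[1]: activities[0],  # story vs. check: story preferred
    pairs[2]: activities[1],  # scenery vs. check: scenery preferred
}
counts = preference_counts(choices)
for activity, wins in counts.most_common():
    print(wins, activity)
```

The number of pairs grows quickly with the length of the list (10 activities already require 45 judgments), which is one practical reason the approach suits only a limited range of attitudes.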




The Social-Distance Scale The social-distance technique, first used
by Bogardus (1), is especially applicable in the evaluation of atti- 
tudes towards ethnic groups. As its name implies, the items on the 
scale are arranged to provide a measure of the closeness of relation- 
ship the respondent is willing to admit to members of a given group. 
A copy of a Bogardus-type scale used with college groups enrolled in 
courses in education and sociology is given below: 


SOCIAL-DISTANCE INVENTORY

Directions: To which relationships are you willing to admit each of the
following:

... OR NATIONALITY | INTERMARRIAGE | CLOSE FRIENDS | NEIGHBORS | FELLOW WORKERS | CITIZENSHIP

Armenians
Bulgarians
Chinese
English
French
Germans
Greeks
Hungarians
Irish
Italians
Japanese


The social-distance scale is easy to administer and score;
interpretation may be difficult. Some persons who would avoid residing in a
predominantly Armenian area would not object to associating with
occasional Armenians in an informal social club. In terms of group
rather than individual measurement, however, the social-distance scale
may be used effectively to measure attitudes.
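
One way to summarize such an inventory for a group, rather than for an individual, is to compute the proportion of respondents who admit a given nationality to each relationship. The Python sketch below illustrates that idea only. It is not Bogardus's scoring procedure: the relationship labels follow the column headings of the inventory as read here, and the replies and the function name are hypothetical.

```python
RELATIONSHIPS = [
    "intermarriage",
    "close friends",
    "neighbors",
    "fellow workers",
    "citizenship",
]


def acceptance_rates(replies):
    """Return, for each relationship, the fraction of respondents willing
    to admit members of the group to it.

    `replies` holds one set of checked relationships per respondent.
    """
    if not replies:
        raise ValueError("at least one respondent is required")
    n = len(replies)
    return {
        rel: sum(rel in checked for checked in replies) / n
        for rel in RELATIONSHIPS
    }


# Hypothetical replies from four respondents about a single group.
replies = [
    {"intermarriage", "close friends", "neighbors", "fellow workers", "citizenship"},
    {"neighbors", "fellow workers", "citizenship"},
    {"fellow workers", "citizenship"},
    {"citizenship"},
]
rates = acceptance_rates(replies)
for rel in RELATIONSHIPS:
    print(f"{rel:15s} {rates[rel]:.2f}")
```

Reporting a rate per relationship, instead of collapsing each respondent to a single number, keeps visible the kind of inconsistency noted above, where a person rejects a closer relationship yet accepts a more distant one or the reverse.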


Free Response Approaches Increasingly, it is being pointed out that
free responses of students in essays may reflect attitudes and values.
As a matter of fact, this concept lies at the root of some of the
procedures employed by Raths in his as yet unpublished value analysis
work. For instance, one type of written exercise Raths has used
profitably is to ask the student to write his reaction to a problem situation
such as the following, which has been paraphrased:

When the Art Museum purchased a large antique bronze work of
art for $75,000, the city had people on breadlines. Do you think the
city museum should have purchased this art piece when people were
suffering from hunger? Why?

The paper is analyzed for positive and negative attitudes and values
towards people, objects, and ideas. Some attention is paid to logic and
consistency of value pattern. Similar analyses can be made of student
autobiographies, diaries, letters, and logs of weekly activities as these
are related to professed values. Free responses in the form of essays
or lists from children on their wishes, their hopes, and how they would
spend a sum of money may also be used for gathering data on values.


Summary 


Attitudes and values are important in education because they af- 
fect learning efficiency, reveal probable behavior, and guide people in 
their thinking. An attitude may be defined as a feeling or predisposi- 
tion to favor or to be against objects, ideas, persons, or groups. Values 
refer to those basic general principles which underlie a number of 
specific attitudes. Attitudes and values are related to the attainment
of educational objectives dealing with skills, information, and interests. 

Attitudes can be evaluated at various levels of the educational
ladder through the use of both standardized and teacher-made
techniques. In general, the few standardized materials which are available
have limited applicability in the classroom situation, in that their cur- 
ricular relevance is relatively low. The teacher, in developing his own 
instruments, may make use of both observational and paper-and-pencil 
approaches. Observation, which may be either controlled or uncon- 
trolled, is furthered by the use of recording devices such as anecdotal 
records, checklists, and logs. The presentation of various stimuli and 
of the sociodrama to elicit specific behavior for observational pur- 
poses has been found helpful. Paper-and-pencil techniques are per- 
haps the most frequently used forms of attitude testing. Agree-dis- 
agree inventories and checklists, rating scale, social-distance, paired 
comparison, and free response approaches have all been used with 
varying degrees of success. 




Problems for Class Discussion 


1. Using the scale reproduced on page 371 as a guide, develop an instru- 
ment designed to measure the attitude of teachers toward the introduc- 
tion of a core curriculum in a high school. 

2. Arrange for the administration of a personality schedule and a scale 
measuring attitude toward minority groups to a high-school class. What 
attitudinal differences appear between well-adjusted and poorly-adjusted 
pupils? 

3. Draw up a series of situations which might be used as a basis for
socio-dramatic study of the attitudes of junior-high-school pupils to authority.


References Cited in This Chapter 
1. Bogardus, Emory S., "A Social-Distance Scale," Sociology and Social
Research, 17:265-271, 1933.

2. Buros, Oscar K., editor, Fourth Mental Measurement Yearbook. Highland
Park, N. J.: Gryphon Press, 1953.

3. Buros, Oscar K., editor, Third Mental Measurement Yearbook. New
Brunswick, N. J.: Rutgers University Press, 1949.

4. Friedman, Bertha B., Foundations of the Measurement of Values.
Teachers College Contribution to Education, No. 914, Bureau of Publications,
Teachers College, Columbia Univ., New York, 1946.

5. Good, Carter V., editor, Dictionary of Education. New York:
McGraw-Hill Book Company, 1945.

6. Smith, Eugene R., and Tyler, Ralph W., Appraising and Recording
Student Progress. New York: Harper & Brothers, 1949.


References for Further Reading 
Jahoda, Marie, Deutsch, Morton, and Cook, Stuart W., Research Methods in 
Social Relations. Volume I. New York: Dryden Press, 1951. 


Chapter VI contains a brief description of varying methods of evaluat- 
ing attitudes and presents some of the research evidence in the field. 


McNemar, Quinn, "Opinion-Attitude Methodology," Psychological Bulletin,
43:289-374, July, 1946.


An excellent critical survey of techniques of evaluating attitudes. 


Remmers, H. H., Introduction to Opinion and Attitude Measurement. New 
York: Harper & Brothers, 1954. 


A complete discussion of the techniques used in measuring opinions 
and attitudes. 


Evaluating Thinking 
and Problem-Solving 


CHAPTER TWENTY 


The complete act of thinking, according to John
Dewey (2), involves the task of finding meanings and trying them out
in order to determine whether a particular course of action
satisfactorily solves a problem in accordance with social values. Considered in
this light, thinking is the process of solving problems. The importance
of thinking becomes quite clear when one remembers that human
beings are continuously being called upon to solve their problems.
Because life is full of problems of varying complexity at varying stages
of growth or chronological age, a preparation for life involves
learning skills in how to think. Since the function of the public school is to
prepare children for living, the curriculum should be replete with
problem-solving situations.

A problem-solving situation is a complex of a number of elements.
It consists of

a. A problem (What should I do?)
b. Courses of action (Which course of action should I take?)
c. Limitations and assumptions (What conditions do I take
for granted?)
d. A set of values (What actions will be consistent with my
values?)
e. Reasons and consequences (What will happen if I do each
of these things?)
f. A solution and verification (What best solves my problem?)

It should be remembered that information plays a significant role
in the thinking process. Facts which are relevant bear directly on the
possible courses of action available, the limitations and assumptions,
the consequences, and the testing of the solution. In this sense, facts
have a functional or meaningful context rather than an independent
existence in memory based upon rote learning. It is true, therefore, 
that one cannot think without facts, but this should not be interpreted 
to mean that unless one learns facts first, one cannot think. The proc- 
ess of thinking compels one to gather facts, to apply facts and princi- 
ples, and to interpret facts or data. However, all of these should be 
part and parcel of the whole educational experience which accrues 
from problem-solving. In this way, facts take on meaning in the con- 
text of the thinking process. 

Problem-solving is related to such basic concepts as interests, needs,
and experiences. A problem which arises for the student is disturbing
to him because a solution is not at hand. Interest is stimulated
when the student is in search of solutions to a problem. The need to 
be active and to have a variety of experiences is satisfied in the quest 
for courses of action which will solve the problem. The problem-solv- 
ing situation may include all of the basic elements entering the learn- 
ing process. The importance of thinking in education, therefore, lies 
in its fundamental relation to the process of solving problems. 

The present chapter considers the place of thinking and problem- 
solving in education and discusses techniques of evaluating them. 


The Place of Thinking and Problem-Solving in 
Education 


Since thinking is a very complex and important process, teachers 
should promote educational objectives encompassing this process. 
More specifically, the teacher, irrespective of grade level or subject 
matter, should diagnose and discourage evidences of poor thinking 
and encourage evidences and techniques of good thinking. 

In order to promote good thinking, the teacher should be familiar 
with those weaknesses which interfere with good thinking. An anal- 
ysis of these weaknesses will be made in terms of the six elements of 
the problem-solving situation given above. 


The Problem Poor thinking may arise from inability to see the real 
problem or from inability to state the problem. Such inabilities may 
arise from an ignorance of the facts making up the problem, confu- 
sions due to irrelevant considerations or questions, and the obscuring 
effects of prejudices upon clarity of vision. Statements made by stu- 
dents about what they think the problem is become the obvious raw 
material that the teacher needs to analyze. 




Courses of Action In resolving a problem, the thinker is faced with 
various alternatives or courses of action. The types of errors a student 
might make include: 


(1) selecting courses of action irrelevant to or inconsistent with
his values

(2) selecting courses of action which do not solve the real
problem

(3) neglecting limitations and assumptions in selecting
courses of action

(4) selecting courses of action which are narrow, limited, or
biased


Limitations and Assumptions All problem-situations contain limita- 
tions and assumptions, whether explicit or implicit. The types of er- 
rors students might make entail: 


(1) lack of awareness of limitations and assumptions

(2) agreement with false assumptions

(3) failure to consider limitations and assumptions


The Values When problems are being resolved, criteria for goodness 
(values) play a significant controlling role. The possible types of 
weakness in thinking related to values include: 


(1) failure to recognize that values apply 

(2) selection of irrelevant values 

(3) failure to see what values do apply to the problem 

(4) failure to apply explicitly any values in problem solution 
(5) use of false values to rationalize a solution 


Reasons and Consequences When people select a position or a course
of action, it is assumed that they have reasons for their choice. These
reasons, in a sense, may represent anticipated consequences of a
course of action based on prior knowledge and experience. Poor
reasoning may involve the following weaknesses:


(1) use of stereotypes

(2) use of overgeneralizations

(3) use of irrelevant reasons

(4) use of inconsistent reasons

(5) use of false authority

(6) use of false analogies

(7) use of incomplete reasons

(8) restatement of assumption as a fact or reason

(9) use of false facts

(10) statement of a false cause-effect relation


Solution and Verification Each solution should be tested or verified 
and the data collected should be interpreted in connection with the 
problem studied. Such data should be complete, accurate, relevant, 
well organized, and well presented. Assuming such data, the student 
in attempting to interpret it may make errors of incompleteness or of 
incorrect interpretations. Of the latter, the causes may be attributed 
to errors of 


(1) extrapolation

(2) interpolation

(3) irrelevancy

(4) false cause and effect relations

(5) going beyond the data


A more extensive discussion of errors in thinking may be found in
books by Overstreet (5) and Thouless (7).


Evaluating Thinking and Problem-Solving 
STANDARDIZED TESTS 


Standardized tests and teacher-made devices are the general means 
one may expect to employ in evaluation and measurement. Unfortu- 
nately, at the present time, there are few, if any, sufficiently valid and 
reliable measures of thinking, either standardized or teacher-made, to 
guide the interested teacher. Undoubtedly, lack of adequate means
may be attributed not only to the relatively recent emphasis in the 
school on thinking as an objective, but also to the difficulties of encom- 
passing a valid and reliable measure of thinking in the time and cir- 
cumstances available for evaluation. One of the essential needs at the 
present time is the development of evaluation instruments to provide 
adequate appraisals of the thinking processes of children. 

The Watson-Glaser Critical Thinking Appraisal, published by World 
Book Company, Yonkers, N. Y., may be cited as typical of the stand- 
ardized measures presently available. The test, which consists of 99 
items, is divided into five subtests, each of which purports to measure
different factors related to the total concept of critical thinking, as
follows:


a. Inference. (Twenty items.) Ability to discriminate among degrees
of truth or falsity or probability of inferences drawn from
facts or data.




b. Recognition of Assumptions. (Sixteen items.) Ability to recognize
unstated assumptions in assertions or propositions.

c. Deduction. (Twenty-five items.) Ability to reason deductively 
from given premises; to recognize the relation between proposi- 
tions; to determine whether what seems an implication or 
necessary inference between one proposition and another is 
indeed such. 

d. Interpretation. (Twenty-four items.) Ability to weigh evidence 
and to distinguish between unwarranted generalizations and 
probable inferences. 

e. Evaluation of Arguments. (Fourteen items.) Ability to distin-
guish between arguments which are strong and important to
the issue and those which are weak and unimportant or irrelevant.
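As a quick check on the figures quoted above, the five subtest lengths can be tallied against the stated total of 99 items. The sketch below is illustrative only; the dictionary and function names are assumptions introduced here and are not part of the published test.

```python
# Item counts per subtest, as quoted in the description above.
# The names used here are hypothetical; only the numbers come from the text.
SUBTEST_ITEMS = {
    "Inference": 20,
    "Recognition of Assumptions": 16,
    "Deduction": 25,
    "Interpretation": 24,
    "Evaluation of Arguments": 14,
}


def total_items(subtests):
    """Return the total number of items across all subtests."""
    return sum(subtests.values())


def share_of_test(subtests):
    """Return each subtest's share of the whole test, as a percentage."""
    total = total_items(subtests)
    return {name: round(100 * count / total, 1) for name, count in subtests.items()}


if __name__ == "__main__":
    print("Total items:", total_items(SUBTEST_ITEMS))
    for name, pct in share_of_test(SUBTEST_ITEMS).items():
        print(f"{name}: {pct}% of the test")
```

The shares make plain that Deduction and Interpretation together account for about half of the items, which may matter to a teacher deciding how much weight to give a total score.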


A portion of the directions and a sample series of items drawn from 
the Interpretation subtest are reproduced below: 


Directions. Each exercise below consists of a short paragraph fol- 
lowed by several proposed conclusions. 

For the purpose of this test assume that everything in the short 
paragraph is true. The problem is to judge whether or not each of 
the proposed conclusions logically follows beyond a reasonable 
doubt from the information given in the paragraph. 


When Great Britain began to offer free public medical service, 
the government was surprised because far more people than they 
had expected came for eyeglasses and dental work. 


72. People who previously had neglected their eyes and teeth now
chose to have such treatment.

73. People who didn't really need these services sought them be-
cause they were free.

74. People in Great Britain previously had been careless about
the state of their eyes and teeth.

75. The British public was pleased with the government health
program.


The test may be used with senior-high-school students. It provides
the teacher with a means of developing insights into his pupils’ think-
ing processes and the basis for the pupils’ errors.


TEACHER-MADE DEVICES 


Although the teacher may not be able to construct as technically
sound a test as the “experts,” his product may be of considerable
value as a teaching device. In addition, a teacher-made test may help
both teacher and pupils clarify their objectives in the development of
critical thinking and problem-solving by making these objectives more
specific and concrete. While it is true that there are few available in-
struments for measuring thinking, happily there are several
approaches which have been utilized, and which teachers may modify
for use in their own classrooms.


Observational Procedures At the nursery-school, kindergarten, and 
lower-elementary-school level, one of the most valuable means of 
evaluating thinking is observation. In connection with this method, 
the teacher may record his data through the use of checklists, anecdotal
records, notes, logs, and schedules. The kinds of material the
teacher might collect on evidences of thinking at this elementary level
are illustrated below.


Louise P. Woodcock (8) reports various problems such youngsters 
meet and try to solve. One sequence of notes shows the progress of a 
child trying to place a chair next to a table: 


"Polly (2; 0) went to her own table where the chair stood in 
proper position except that it was shoved under the table a little
too far. She took it by the corner of the back and drew it toward
her, which opened a space, but not on her side of the chair. She
pulled and stopped to look and pulled again, all the time widening
the space at the far side of the chair but not helping her with her
problem unless she should let go of the back and go around to
that side. Did not solve the problem."


Landreth and Read (4) report the following gem illustrating think-
ing at the four-year-old level:


"At a nursery school lunch table, the question of babies’ teeth 
came up. Between mouthfuls of custard, John, aged three and a 
half, asked if babies had teeth. Custard eating was suspended 
while this problem received the consideration it deserved. Mary, 
with the conviction born of close observation of a two-month-old 
brother, said, ‘No, babies don't have teeth. Our baby doesn't 
have teeth.’ Here Martha, who has a nine-month-old sister at 
home, interrupted quickly, ‘My baby sister, Kathleen, has teeth.’
For a minute there was silence while Martha swallowed custard 
before enlarging on Kathleen's dentition. Four-year-old Dick's eye 
suddenly lighted, ‘I know, Martha,’ he said. ‘Only babies called 
Kathleen have teeth.’ Martha beamed, and the entire group returned
to the custard, completely satisfied with this masterly sum-
ming up of the situation.


“Actually, Dick had done very well with the information at his 
disposal. Unfortunately for the validity of his conclusion, he had 
no knowledge of the relationship between age and dentition. The
validity of scientists’ conclusions is similarly subject to their aware- 
ness of the factors involved. 




“Dick’s teacher, though making no comment at the time, arranged 
for both babies to visit the school one morning so that all the 
children could see that babies are not just babies but likely to vary 
according to age." 


Susan Isaacs (3) reports many examples of young children thinking. 
She writes: 


"We avoided offering ready-made explanations to the children, 
not only because we did not want to foster verbalism, but also be- 
cause we did not want to substitute ourselves as authority for the 
children's own discovery and verification of the facts." 


She illustrates this approach with the following interesting account. 


a. “The rabbit had died in the night. Dan found it and said, ‘It’s
dead—it's tummy does not move up and down now.’ Paul said, ‘My 
daddy says that if we put it into water, it will get alive again.’ Mrs. 
I. said, ‘Shall we do so and see?’ They put it into a bath of water. 
Some of them said, ‘It is alive.’ Duncan said, ‘If it floats, it’s dead, 
and if it sinks, it’s alive.’ It floated on the surface. One of them 
said, ‘It’s alive, because it’s moving.’ This was a circular move- 
ment, due to the currents in the water. Mrs. I. therefore put in a 
small stick, which also moved round and round, and they agreed 
that the stick was not alive. They then suggested burying the rab- 
bit, and all helped to dig a hole and bury it.” 

b. “The next day, Frank and Duncan talked of digging the rabbit 
up but Frank said, ‘It’s not there—it’s gone up to the sky,’ and 
gave up digging. Mrs. I. therefore said, ‘Shall we see if it’s there?’
and also dug. They found the rabbit, and the children were very
interested to see it still there.”


Observational procedures, of course, need not be limited to the lower 
grades. Such approaches are also appropriate at the upper-elementary 
and junior-senior-high-school levels. The extent to which students 
sense a problem, know how to select, organize, and weigh evidence, 
and draw conclusions and test them in new situations may be revealed 
in student conversations, discussions, panel discussions, construction 
or painting activities, and reports. The teacher who can analyze such 
things as diaries, letters, essays, speeches, autobiographical statements, 
physical tasks involving problem solutions (fixing equipment; constructing
stage sets; analyzing why something mechanical or physical
is not working properly) and play or games may be in a position to 
round out a more complete picture of the thinking of a pupil. The 
teacher who is close to the student and can observe thinking at various 
times in various fields or areas is more likely to make a major contribution
to the child's thinking.




The Interview The doctoral study of Edwina Deans (1) may be 
cited as an example of the use of the interview in appraising the think- 
ing processes of children. Deans gathered data on the methods second- 
grade children used to solve problems of addition, subtraction, divi- 
sion, and multiplication. By asking pupils informally in interview 
fashion how they secured their answers she was able to diagnose their 
thinking processes and improve their arithmetical skills. For instance, 
the child who said that 5 + 5 is 10 when asked about his answer might
respond by counting his ten fingers one by one. The teacher, who feels 
that the child should be responding at a higher level (seeing 2 groups 
of 5 as 10), can, as a result of the interview, supply those experiences 
which will help the child think at a higher level. The interview method 
can be applied not only to arithmetic but to every other subject-matter
field or problem situation. It is a valuable means of diagnostic and
achievement testing.


Pencil-and-Paper Approaches At the upper levels of the educational 
ladder, paper-and-pencil techniques may be added to the use of ob- 
servation and interview as appropriate means of evaluating aspects of 
thinking and problem-solving. Some of the more useful devices which 
have been developed are presented below. 

In a series of measures¹ developed in connection with an appraisal
of the effectiveness of newer elementary-school practices, Wrightstone 
(9) uses the following as one of a series of exercises designed to ap- 
praise pupil ability to obtain facts from given data: 


TABLE II Number of Cattle and Goats Raised in Four 
Countries—1935 


Number of Cattle     Place         Number of Goats
202,355,000          India         47,000,000
49,271,000           Russia        12,500,000
30,868,000           Argentina     5,300,000
18,918,000           Germany       2,800,000


1 Test of Critical Thinking in the Social Studies, Form A. Bureau of Publications, 
Teachers College, Columbia Univ., 1942. 




5. How many goats were raised in Russia in 1935? (1) 18,918,000 (2) 
49,271,000 (3) 12,500,000 (4) 47,000,000 

6. In what country were most cattle raised in 1935? (1) Russia (2) Argen- 
tina (3) Germany (4) India 

7. What country raised the least number of goats in 1935? (1) India (2)
Russia (3) Argentina (4) Germany

8. What country raised 5,300,000 goats in 1935? (1) Germany (2) Russia
(3) Argentina (4) India
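Because every item in this exercise can be answered by reading Table II directly, a teacher adapting the format could derive the keyed options from the table itself rather than by hand. The following is a minimal sketch under that assumption; the function names and data layout are hypothetical, and the keys it yields are derived here from the table, not quoted from the published test.

```python
# Figures transcribed from Table II (cattle and goats raised, 1935).
TABLE_II = {
    "India": {"cattle": 202_355_000, "goats": 47_000_000},
    "Russia": {"cattle": 49_271_000, "goats": 12_500_000},
    "Argentina": {"cattle": 30_868_000, "goats": 5_300_000},
    "Germany": {"cattle": 18_918_000, "goats": 2_800_000},
}


def country_with_most(animal):
    """Country with the largest number of the given animal."""
    return max(TABLE_II, key=lambda country: TABLE_II[country][animal])


def country_with_least(animal):
    """Country with the smallest number of the given animal."""
    return min(TABLE_II, key=lambda country: TABLE_II[country][animal])


def country_with_value(animal, number):
    """Country whose figure for the animal equals `number`, or None."""
    matches = [c for c in TABLE_II if TABLE_II[c][animal] == number]
    return matches[0] if len(matches) == 1 else None


def keyed_option(options, correct):
    """1-based position of the correct answer among the printed options."""
    return options.index(correct) + 1


# Options are listed in the order in which they are printed in items 5 to 8.
ANSWER_KEY = {
    5: keyed_option([18_918_000, 49_271_000, 12_500_000, 47_000_000],
                    TABLE_II["Russia"]["goats"]),
    6: keyed_option(["Russia", "Argentina", "Germany", "India"],
                    country_with_most("cattle")),
    7: keyed_option(["India", "Russia", "Argentina", "Germany"],
                    country_with_least("goats")),
    8: keyed_option(["Germany", "Russia", "Argentina", "India"],
                    country_with_value("goats", 5_300_000)),
}

if __name__ == "__main__":
    for item in sorted(ANSWER_KEY):
        print(f"Item {item}: option ({ANSWER_KEY[item]})")
```

Deriving the key from the data in this way also guards against a distractor accidentally duplicating the correct figure when the table is revised.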


Another section of the test deals with pupil ability to draw conclu-
sions from facts. In this instance, directions to the pupil and the test 
items take the following form: 

Mark with: (+) every statement which is true and can be proved 
by the facts stated 
(0) every statement which might be true but cannot 
be proved by the facts stated 
(—) every statement which is false as shown by the 
facts stated. 


III. Cost per Mile to Transport a Ton of Wheat by Different Methods

Wagon                  30
Auto truck             15
Railway                65/100 of 1 cent
Ocean or river boat    1/100 of 1 cent

9. Wheat is usually delivered to markets by ocean freight  9.( )
10. It costs more to transport wheat by land than by water  10.( )
11. The fastest way of transporting is by railroad  11.( )
12. A farmer would save money if he could ship his wheat by boat  12.( )
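The three-way marking scheme (+, 0, and the minus mark) turns on whether the stated facts are sufficient to settle a statement. The sketch below illustrates that distinction for items 9 to 12. It rests on two assumptions that should be noted: the bare figures 30 and 15 printed for wagon and auto truck are read as cents, and the marks shown are one reading of the table offered for illustration, not the published key. All names are hypothetical.

```python
# Cost per mile to move a ton of wheat, taken from the table above.
# ASSUMPTION: the bare figures 30 and 15 printed for wagon and auto truck
# are read here as cents, to match the units given for railway and boat.
COST_CENTS = {
    "wagon": 30.0,
    "auto truck": 15.0,
    "railway": 0.65,
    "boat": 0.01,
}
LAND = ("wagon", "auto truck", "railway")
WATER = ("boat",)


def mark(verdict):
    """Translate a verdict into a mark: True is '+', False is '-', None is '0'."""
    if verdict is None:
        return "0"
    return "+" if verdict else "-"


def land_costs_more_than_water():
    """Statement 10: every land method costs more than every water method."""
    return min(COST_CENTS[m] for m in LAND) > max(COST_CENTS[m] for m in WATER)


def boat_is_cheapest():
    """Statement 12: shipping by boat costs less than any other listed method."""
    return all(COST_CENTS["boat"] < COST_CENTS[m] for m in LAND)


# Statements 9 and 11 concern how often a method is used and how fast it is.
# The table holds cost figures only, so no verdict can be drawn from it.
VERDICTS = {
    9: None,
    10: land_costs_more_than_water(),
    11: None,
    12: boat_is_cheapest(),
}
MARKS = {number: mark(verdict) for number, verdict in VERDICTS.items()}

if __name__ == "__main__":
    for number in sorted(MARKS):
        print(f"Statement {number}: ({MARKS[number]})")
```

The point of the sketch is the `None` case: a pupil who marks statement 9 or 11 as true or false is supplying information that the table does not contain.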


A third section of the test, designed as a measure of pupil ability to 
apply general facts, uses the following approach: 


Directions: This section has a number of paragraphs. Below each
paragraph are two sets of statements about the paragraph. In the
left-hand column are five statements. Three of these statements
will help you to understand the three references in the right-hand
column. Select a statement from the left-hand column which best
explains a reference in the right-hand column. Write the number
of the statement in the space after the reference.


New York City is one of the largest cities in the world. 
Rents for houses or apartments are high in all parts of the 
city. Most of the people living in the part of the city called 
Harlem are Negroes. Many Germans live in the part called 
Yorkville, and many Italians live in the part of the city around 
Mulberry Street. Similarly, in Chicago, Boston, Philadelphia, 
and Los Angeles people of the same race or nation are likely 
to live in the same part of the city. 


1. A large city has many           1. Explains the presence of
   kinds of people.                   Negroes, Italians, and Germans
2. Big stores grow in big             in New York City ( )
   cities.                         2. Explains why rents are
3. People of the same race            high in New York City ( )
   or nation stay together in      3. Explains why there is a
   a big city.                        Harlem ... ( )
4. Rents are high in large
   cities.
5. Prices help to decide how
   much goods will be sold.


Another type of item, designed to measure the pupil’s knowledge of 
facts as well as his ability to apply them to a simple problem, is illus- 
trated by the following exercise. The student is asked to check a given 
course of action and to indicate the reasons for his choice:

Elizabeth is forty pounds overweight and feels that she is
unattractive and unpopular because of it. She has decided to
reduce her food intake to salads and liquids for several ...
program of exercise ... lose her excess weight. Would you ...

___ A. she may injure her health by taking these ...
___ B. she is wise to undertake a reducing pro...

___ 1. Lots of exercise and limited diet ... combination for reducing ...

___ 2. Dietetic authorities agree that you ... without any limitation
of diet ...

___ 3. Physical educators endorse such a program as safe
and healthy.

___ 4. Elizabeth is probably overweight because she eats
too much, so this plan would be safe and effective.

___ 5. Extreme limitation of diet over a long period of time
can lower the body's resistance to disease and infec-
tion.

___ 6. Elizabeth's excess weight may be the result of a
physical condition which would be made worse by
hard exercise.


The reasons given include false facts, argument by false authority, 
overgeneralizations, and stereotypes, in addition to a few good reasons. 
The test problems are used to locate evidences of poor (based upon 
above weaknesses) or good thinking (sound facts). Not only were 
students interested in taking this kind of test but they enjoyed discuss- 
ing the correct responses. As a teaching device to discover stereotyped 
attitudes and acceptance of false facts and to stimulate class discus- 
sion, this is a valuable approach. 

In the social studies field, a type of thinking and attitude test is 
illustrated by the "Government and Business" excerpt from an Evalu- 
ation of School Broadcasts’ Staff Test (now out of print) on the next 
page. This type of test item includes a paragraph of facts, the statement 
of the problem in terms of three opinions, and ten possible reasons for 
each of the three opinions. The student is asked to pick the opinion 
which is closest to his own and to indicate which of the ten reasons he 
thinks are valid to back up his opinions. The reasons include a pattern 
of four good reasons and six poor reasons (stereotypes, overgeneraliza- 
tions, and false authority). The three opinions were written in an 
attempt to reveal conservative, liberal, and radical attitudes toward 
government and business. Consistency, uncertainty, and "reasoning 
ability" can be appraised if enough worthwhile items are included. 

The Evaluation Staff of the Progressive Education Association in con- 
junction with the Eight-Year Study constructed a variety of tests which 
are designed to evaluate various aspects of thinking in the social and 
physical sciences (6). Illustrations from two of the most promising of 
these tests (now out of print) are also reproduced on the following 
pages. The first excerpt is from the Application of Principles Test (1.3) 
in the physical sciences, published by Educational Testing Service. 
The test contains a problem, alternative conclusions, and reasons for 
selecting any conclusion. 


GOVERNMENT AND BUSINESS

Problem: What should be the relationship between government and
business?


OPINION A

The federal government should not engage in any business which
competes with private industry, but it should exercise such regulation
as may be necessary in order to insure fair trade practices and
fair prices.

Reasons for Opinion A

31. Real national prosperity is possible only under a system of
carefully regulated competition.

32. The government should act as a referee, not as an active player,
in the game of business.

33. Most leaders of big business are perfectly willing for the
government to exercise some regulation so long as it stays out of
competition with private business.


OPINION B

The federal government should extend public or cooperative ownership
and management to all the basic industries of the nation.

Reasons for Opinion B

41. Under the present competitive system in industry, the stronger
forces can easily take advantage of the weaker.

42. The injustices which exist under private industry would disappear
under a system of government ownership.

43. Public ownership of basic industries is necessary for the
conservation of national resources.


OPINION C

The federal government should allow business to solve its own problems
and provide encouragement to businessmen to expand private industry.

Reasons for Opinion C

51. That government is best which governs least.

52. Many of our leading industrialists have pointed out that proved
ways of managing business are more to be trusted than idealistic
schemes of government dictation.

53. When business is allowed to go its own way, prices and wages
should be determined on the basis of free competition.


PROBLEM II

Water is being poured into a tall slender jar. What will happen to
the pitch of the sound coming from the jar while it is filling up?


Directions: Choose the conclusion which you believe is most con-
sistent with the facts given above and most reasonable in the light
of whatever knowledge you may have, and mark the appropriate
space on the Answer Sheet under Problem II.




Conclusions. 
A. The pitch will remain the same. 
B. The pitch will become higher. 
C. The pitch will become lower. 


Directions: Choose the reasons you would use to explain or sup- 
port your conclusion and fill in the appropriate spaces on your 
Answer Sheet. 
Reasons. 

1. The pitch of a sound is synonymous with its frequency.

2. The frequency of a sound is directly proportional to its wave
length.

3. The vibration of the jar depends only upon its dimensions.

4. The less the wave length of a sound wave, the greater is its
frequency.

5. The volume of the water is gradually being increased.

6. The air column at resonance is one-fourth of a wave length
long.

7. Disturbances in the air are called sound.

8. The wave length of the sound from a vibrating water column
depends directly upon its length.
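To see how the listed reasons bear on the problem, reasons 4 and 6 can be combined into a short calculation: if the air column is one-fourth of a wave length long, and frequency rises as wave length falls, then a shortening air column implies a rising frequency. The sketch below works this through under simplifying assumptions that are not part of the test: the jar is treated as an ideal tube closed at one end, the speed of sound is taken as 343 metres per second, and the jar height and water levels are invented for illustration.

```python
# ASSUMPTION: speed of sound in air at roughly room temperature.
SPEED_OF_SOUND_M_PER_S = 343.0


def fundamental_frequency(air_column_m):
    """Frequency of the fundamental for an ideal tube closed at one end.

    Reason 6: the air column is one-fourth of a wave length long, so the
    wave length is four times the column length.
    Reason 4: the shorter the wave length, the greater the frequency
    (frequency = speed / wave length).
    """
    if air_column_m <= 0:
        raise ValueError("air column must have positive length")
    wave_length_m = 4.0 * air_column_m
    return SPEED_OF_SOUND_M_PER_S / wave_length_m


def frequencies_while_filling(jar_height_m, water_levels_m):
    """Frequencies for a jar of given height at successive water levels."""
    return [fundamental_frequency(jar_height_m - level) for level in water_levels_m]


if __name__ == "__main__":
    # Hypothetical jar, 0.30 m tall, with water rising in 0.10 m steps.
    levels = [0.0, 0.10, 0.20]
    for level, freq in zip(levels, frequencies_while_filling(0.30, levels)):
        print(f"water at {level:.2f} m -> about {freq:.0f} cycles per second")
```

Under these assumptions the frequency climbs as the jar fills, which shows how a pupil who selects reasons 4 and 6 can reason from them to a conclusion, and why reason 2, which reverses the relation between frequency and wave length, is a false fact of the kind the test is designed to detect.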


The Interpretation of Data Test, published by the Educational Test- 
ing Service, was designed to appraise general accuracy, accuracy with 
"probably" statements, accuracy with "insufficient data" statements, 
accuracy with "true" and "false" statements, caution in understatement, 
going beyond the facts, and crude errors. The following excerpt from 
an early form of the printed test asks the pupil to indicate which of 
the following responses to individual statements is applicable: 


(1) the evidence is sufficient to make the statement true. 

(2) the evidence suggests that the statement is probably true. 

(3) the evidence is insufficient to make a decision concerning the
statement.

(4) the evidence suggests that the statement is probably false. 

(5) the evidence is sufficient to make the statement false. 


The test item is given below: 


Observations on many cases of the effect of carbon monoxide in
the atmosphere and the concentration in the blood of adult human
beings have been made. The following graph gives the average
values of the concentration of carbon monoxide in the atmosphere,
the per cent of the possible blood saturation and the most probable
effect it produces on adult people.


[Graph: per cent of possible blood saturation (vertical scale marked
10 to 60) plotted against carbon monoxide in the atmosphere, in parts
per 100,000 parts of air (horizontal scale marked 5 to 90). Zones of
effect, from lowest to highest saturation: HARMLESS, HEADACHE,
UNCONSCIOUSNESS, DEATH.]


STATEMENTS 


. In adult human beings, the carbon monoxide content of the 
blood increases with its increasing concentration in the atmos- 
phere. 

. Death never occurs in human beings when exposed to an at- 
mosphere which contains 75 parts of carbon monoxide to 
100,000 parts of air. 

. People who are found unconscious in garages are usually suf- 
fering from carbon monoxide poisoning. 

. In our present complex civilization, we must expect that a 
large number of people will suffer from carbon monoxide 
poisoning. 

. Doubling the concentration of carbon monoxide in the air
doubles its content in the blood of adult human beings.

11. If Mary Smith, a healthy 22 year old office worker, were ex-
posed for some time to an atmosphere which contained 60
parts of carbon monoxide per 100,000 parts of air she would
lose consciousness.
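The scoring categories named for this test (general accuracy, caution, going beyond the facts, crude errors) can be pictured as a comparison between the keyed response and the pupil's response on the five-point scale. The sketch below is one possible reading, offered as an assumption: the category definitions, the names, and the sample responses are all hypothetical and are not taken from the published scoring manual.

```python
from collections import Counter

# Response codes follow the five choices listed above:
# 1 true, 2 probably true, 3 insufficient data, 4 probably false, 5 false.


def classify(keyed, given):
    """Compare a pupil's response with the keyed response.

    ASSUMED definitions, for illustration only:
      accurate      the two responses agree
      crude error   the pupil lands on the opposite side of 'insufficient data'
      beyond data   the pupil claims more certainty than the key allows
      overcautious  the pupil claims less certainty than the key allows
    """
    for value in (keyed, given):
        if value not in (1, 2, 3, 4, 5):
            raise ValueError("responses must be coded 1 to 5")
    if given == keyed:
        return "accurate"
    # Signed distance from 'insufficient data'; negative is the 'true' side.
    key_side, given_side = keyed - 3, given - 3
    if key_side * given_side < 0:
        return "crude error"
    if abs(given_side) > abs(key_side):
        return "beyond data"
    return "overcautious"


def profile(keyed_responses, given_responses):
    """Tally the classifications over a whole set of statements."""
    if len(keyed_responses) != len(given_responses):
        raise ValueError("key and responses must be the same length")
    return Counter(classify(k, g) for k, g in zip(keyed_responses, given_responses))


if __name__ == "__main__":
    # Hypothetical key and pupil responses for five statements.
    key = [1, 3, 3, 5, 2]
    pupil = [1, 2, 3, 4, 5]
    for category, count in sorted(profile(key, pupil).items()):
        print(f"{category}: {count}")
```

A profile of this kind separates the pupil who habitually reads too much into the data from the pupil who hedges on statements the data plainly settle, a distinction that a single total score would hide.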


This brief survey of various types of paper-and-pencil test items
should not be looked upon as a definitive analysis of available ap-
proaches. The resourceful teacher will find it possible to modify the
techniques suggested here and to construct other types of items more 
suitable to his age and grade groups. Moreover, the teacher should 
bear in mind that paper-and-pencil approaches should be supple- 
mented by other means of evaluating the thinking of his pupils. To 
the degree that teachers can secure evidence of the ability their pupils 
display in solving problems in a wide variety of school and non-school 
situations, there is greater assurance that valid measures of true ability 
are being obtained. 


Summary 


Thinking may be defined as the finding and testing of meanings. It 
involves a process consisting of six aspects: (1) a problem, (2) courses 
of action, (3) limitations and assumptions, (4) a set of guiding values, 
(5) reasons and consequences, and (6) a solution and its verification. 
Thinking, when understood as the process of solving problems, is 
important because life is full of problems which demand intelligent 
solutions. The promotion of good thinking and the diagnosis and dis- 
couraging of poor thinking should be the main goals of educators in 
this area of school objectives. There are few standardized tests avail- 
able for evaluating pupil thinking and problem-solving ability. Ob- 
servational and interview procedures are useful in the lower grades; 
paper-and-pencil techniques may also be used in the upper grades. A 
comprehensive program of evaluation would entail a variety of meas- 
ures, used in both school and out-of-school situations. 


Problems for Class Discussion 


1. Select an editorial dealing with a controversial issue from your local 
newspaper, and present it to a junior- or senior-high-school social studies 
class. Obtain from the group their reasons for accepting or rejecting the 
point of view expressed. Analyze these reasons to determine what evi- 
dences of good or poor thinking are shown. 

2. Using another editorial or article drawn from a similar source, ask the 
group to underline all emotionally-toned words or phrases which are 
used by the writer. Set up some method of arriving at a quantitative 
measure of the ability of each pupil to recognize the appeal to emotion. 
To what extent do the more intelligent and less intelligent members of 
the group show differences in ability? 

3. Observe a nursery-school group in action for a full day. Keep anecdotal
records of the nature of the problems met by individual children and 
of the techniques used in solving these problems. 




References Cited in This Chapter 


1. Deans, Edwina, The Effect of Certain Immature Procedures on the
Learning Process by Second Grade Children. Unpublished Ph.D. thesis,
University of Cincinnati, 1950.

2. Dewey, John, How We Think. Boston: D. C. Heath and Company, 1933.

3. Isaacs, Susan, The Experimental Construction of an Environment Opti-
mal for Mental Growth, in Murchison, Carl, editor, A Handbook of
Child Psychology. Worcester: Clark University Press, 1931.

4. Landreth, Catherine, and Read, Katherine H., Education of the Young
Child. New York: John Wiley and Sons, 1942.

5. Overstreet, Harry A., The Mature Mind. New York: W. W. Norton and
Company, 1949.

6. Smith, Eugene R., and Tyler, Ralph W., Appraising and Recording
Student Progress. New York: Harper & Brothers, 1942.

7. Thouless, R. H., How to Think Straight. New York: Simon and Schuster,
1939.

8. Woodcock, Louise P., Life and Ways of the Two-Year-Old. New York:
E. P. Dutton and Co., 1941.

9. Wrightstone, J. Wayne, Appraisal of Newer Elementary School Prac-
tices. New York: Bureau of Publications, Teachers College, Columbia
Univ., 1938.


References for Further Reading 


Dewey, John, How We Think. Boston: D. C. Heath and Co., 1933. 


A basic description of the nature of the thinking process. While the 
author's style makes for difficulty, careful reading of the analysis of the
steps involved in thinking will be rewarding. 


Raths, Louis E., Appraising Certain Aspects of Student Achievement. 
Thirty-seventh Yearbook of the National Society for the Study of Educa- 
tion, Part I. Bloomington: Public School Publishing Company, 1938. 


Contains descriptions of a variety of techniques for evaluating pupil 
ability in the area discussed in this chapter. 


Smith, Eugene R., and Tyler, Ralph W., Appraising and Recording Student 
Progress. New York: Harper & Brothers, 1942. 


Chapter II describes the instruments devised by the Evaluation Staff 
of the Eight-Year Study conducted by the Progressive Education Asso- 
ciation to measure aspects of thinking. 


Evaluating Health 
and Physical Development 


CHAPTER TWENTY-ONE 


The development of optimal physical health is a goal 
which is accepted by all segments of society. Because of the im- 
portance which society attaches to this goal, it is not surprising that 
attempts have been made to legislate desirable practices in both the 
school and the community. 


DIRECT HEALTH INSTRUCTION 


Many states by legal statute require that health instruction be pro- 
vided in the elementary and junior-high schools. This kind of legal 
requirement and the increasing sensitivity of teachers to the signifi- 
cance of the maintenance of physical health as an educational objective 
have resulted in the development of programs of health instruction 
which include imparting information, stimulating interests, molding 
attitudes, and encouraging good personal health practices. The re- 
sponsibility for giving such instruction falls most directly on the 
teacher. Depending upon the curriculum organization, such instruc- 
tion may be provided directly by a health or physical education in-
structor or as a part of instruction in science, home economics, social
studies, etc. The function of the teacher is primarily that of bringing 
about those changes in pupil behavior which make for the main- 
tenance and improvement of health by the pupil himself. 


SCHOOL POLICIES AND PHYSICAL 
FACILITIES RELATED TO HEALTH 

Paralleling the legal statutes requiring health instruction are statutes 
stipulating minimum standards concerning building construction, 
safety, and inspection procedures. In practice, such minimum standards
are often exceeded through pressures exerted by alert and sensitive
Boards of Education, administrative officers, and parent-teacher as-
sociations and similar organizations. Procedures and policies set by
law apply usually to traffic conditions (safety to and from school),
practices in the school gymnasium and cafeteria, heating plants and 
electrical systems, window arrangement, and ventilation, as well as to 
direct health inspection by the nurse or teacher. The responsibility for 
school policies and maintenance of physical equipment conducive to 
the improvement of health generally falls directly upon the administra- 
tive officer, who in turn usually shares this responsibility with desig- 
nated members of the school staff by giving certain kinds of authority
to such persons as the school nurse, the physical-education teacher,
and the classroom teacher.


HOME AND COMMUNITY PRACTICES RELATED TO HEALTH 


A third major group of practices and activities which have a direct
bearing on physical health are those occurring in the home and in
other aspects of the “non-school” environment. Here again many prac-
tices and building and equipment specifications are "spelled out" in
state and municipal law. Often, however, inadequate provision is made
for determining whether the minimum standards specified by law are
being followed. Too often, it takes major catastrophes to arouse the
citizenry of a community to the point where they insist on systematic
inspection to insure that the laws are being met. The responsibility
for the maintenance of valid home and community procedures affect-
ing health is unfortunately divided among various individuals and
agencies. The local and state Board of Health, the municipal engineer,
state architect, local and state police, and other agencies deal in one
way or another with major problems of home and community practices
involving physical health.

The Objectives of Health Instruction 


A comprehensive list of objectives of health instruction has been 
compiled by Mabel E. Rugen and Dorothy Nyswander (6). Selections 
from this list are reproduced here as a basis for illustrating the nature 
and scope of health instruction as it is perceived today and as a basis 
for determining appropriate methods of evaluation. 


1 A detailed summary of various responsibilities and functions assumed by different 
people and organizations is presented in Delbert Oberteuffer, School Health 
Education, New York: Harper & Bros., 1949, Chapter 16. 




OBJECTIVES OF HEALTH EDUCATION 
With Respect to Individual Health 


1. To know that every individual has a definite responsibility for 
his own health and must accept this responsibility if he would 
achieve his aims most fully, and that the way he lives is important 
to his personal health and to the attainment of his life goals. He 
should, however, be aware of the fact that health is not a responsi- 
bility he carries by himself as an individual alone, but one which 
he shares with all members of his social groups. 


4. To know the relation of adequate sleep and rest to the physi- 
ology of tissue functioning and fatigue, and to accept the fact that 
adequate sleep and rest each day is essential for counteracting the 
undesirable effects of fatigue and for promoting normal growth. 


5. To know the importance of a quiet and restful environment in 
contributing to the best conditions for relaxation, rest, and sleep, 
and to know how to provide such an environment. 


6. To know how to participate in play and exercise and to accept 
the fact that vigorous play and exercise, out-of-doors when pos- 
sible, are important for good bodily function, and that the kind 
and amount of exercise desirable for different age and sex groups 
vary with the individual. 


7. To know that it is through suitable strenuous physical-education 
activities, exercise, and play, that one acquires endurance, strength, 
stamina, i.e., the physical power to achieve difficult tasks, and to 
know the relationship of this ability to physical fitness and the 
factors that contribute to total body fitness. 


10. To know how the various kinds of food function in the human 
body, how to obtain these foods, and how to improve his own 
habits of eating and food selection, and to accept the fact that a 
well-balanced daily diet is essential for his best growth and health. 


18. To know the need for and the essentials of a good periodic 
appraisal of one's own health status, and to understand what the 
individual, teacher, parent, and health specialist can do in making 
such an appraisal. 


With Respect to Family Health 


28. To accept the fact that the health of a family is dependent on 
the healthful or unhealthful living of each individual composing 
the family. 


29. To know the family health services available through the local
health department, visiting nurse association, hospitals, private
physicians and dentists, or other health agencies or personnel
in the community, as well as the desirable procedures for selecting,
obtaining, and using these services.


32. To know how to improve environmental health conditions
with reference to light, heat, ventilation, and sanitation within the 
home and on the home premises, and to accept a share of the re- 
sponsibility for making these improvements. 


33. To know the importance of checking consumer advertising of
medicine, foods, and treatments before purchasing the product for 
use by any members of the family. 


With Respect to Community Health 


36. To accept the facts that public health laws, procedures, and 
programs are designed to protect the health of citizens in the com- 
munity and that individuals have a responsibility to co-operate in 
the effort to improve the health program for the total community. 


40. To know how to make his community safe with reference to
the prevention of accidents due to traffic, fire, drowning, or other
hazards.


Evaluation Techniques for 
Various Health Objectives 


The techniques employed for the collection of evidence concerning
the achievement of improvement in health instruction, policies, and
practices vary not only with the age of the children involved, but
with the phase of health under consideration. In general, such
techniques include observation, physical measurements of growth and
development, tests of motor and sensorimotor proficiency, and
paper-and-pencil tests and questionnaires inquiring into health
attitudes, interests, and practices as well as pupil knowledge of
health.

Teacher Observation The classroom teacher not only fills a strategic
position in educating pupils in the principles of healthful living, but
he serves as the starting point for a comprehensive program of
evaluation. The time that the child spends in school offers a period
for continuous observation. The teacher can readily note the child who
repeatedly fails to respond to a spoken request, or the pupil who
strains to read the blackboard and holds his book very close or who
holds his head on one side while reading. The confirmed mouth-breather,
the apathetic and listless child, and the pupil who shows disregard for
cleanliness and the spread of infection can be readily identified.
During the course of the school day, the classroom teacher is in an
excellent position for identifying physical difficulties and for
referring the child for the special help which may be indicated.

In many school systems, the classroom teacher has also been charged 
with the responsibility of conducting a more formal approach to the 
detection of visual and auditory defects. The most commonly used 
method of evaluating pupil vision makes use of a Snellen-type chart. 
The child is asked to identify letters on lines of successively smaller 
size type from a distance of twenty feet. Each eye is tested separately. 
Normal vision is represented as a score of 20/20. Scores of 20/30, 
20/40, etc., represent defective acuity, in that the child can see at 20 
feet letters which the normal eye can perceive at 30 feet, 40 feet, etc. 
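
The arithmetic behind a Snellen score can be sketched in a few lines of Python. This is only an illustration of the notation described above: the function names and the referral cutoff of 20/30 are assumptions made for the example, not part of any published screening procedure.

```python
from fractions import Fraction


def parse_snellen(score: str) -> Fraction:
    """Turn a Snellen score such as '20/40' into a fraction.

    The numerator is the testing distance in feet; the denominator is the
    distance at which the normal eye can read the same line of letters.
    """
    test_distance, normal_distance = score.split("/")
    return Fraction(int(test_distance), int(normal_distance))


def needs_referral(left_eye: str, right_eye: str, cutoff: str = "20/30") -> bool:
    """Flag a pupil when either eye scores at or below the cutoff.

    The cutoff is a hypothetical value chosen for illustration only.
    """
    poorer_eye = min(parse_snellen(left_eye), parse_snellen(right_eye))
    return poorer_eye <= parse_snellen(cutoff)


# Each eye is tested separately, so two scores are recorded per pupil.
print(float(parse_snellen("20/40")))
print(needs_referral("20/20", "20/40"))
```

A score of 20/20 reduces to 1, and every score with a larger denominator reduces to a fraction below 1, which is why the poorer eye is simply the smaller of the two fractions.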

The use of a Snellen-type chart as the sole screening technique has 
many limitations. When the chart is used in group situations, many
children are able to mask their poor vision by memorizing it. Moreover, even
when used as an individual test, the one defect the chart discloses is 
nearsightedness. It does not detect moderate degrees of farsightedness 
or astigmatism, nor does it offer any means for locating children who 
show severe cases of poor fusion or eye muscle imbalance. 

The Eames Eye Test, available from the World Book Company, 
Yonkers, N. Y., provides an inexpensive series of tests which go be- 
yond the conventional Snellen-type chart to provide measures of 
farsightedness, astigmatism, coordination, fusion, and eye dominance. 
The test is simple to administer and interpret, and represents an ex- 
cellent tool for teacher use in disclosing difficulties which require 
professional attention and possible treatment. 

The most accurate way of measuring pupil hearing in the school
situation is through the use of an audiometer.² As many as forty
children can be tested at one time. The child listens through an earphone
to a phonograph recording and writes down the numbers which he 
hears. Since the numbers are spoken at different degrees of loudness, 
an estimate of hearing loss can be made. An individual audiometric 
test, of course, provides much finer discrimination, in that amount of 
hearing loss for tones of high, medium, and low pitch can be deter- 
mined. 

When audiometers are not available, the teacher must make use of
relatively crude techniques. The whisper test and the watch test, both
of which have many limitations, may be used by teachers. The major
difficulty encountered in using these tests is the inability to determine
what constitutes normal hearing in the room in which the test is given.


2 Audiometers suitable for use in schools are manufactured by the Western Electric
Company and the Maico Company, Inc., Minneapolis. Models for both group and
individual testing are available.


Measurements of Growth and Development Perhaps the only cri- 
teria readily available to the teacher for judging the physical status 
and growth of the child are repeated measurements of the child's 
height and weight. An evaluation of developmental status and growth 
may be made by comparing the child's height and weight with norms 
which represent the "average" child and by comparing the child's 
gains in height and weight to those of other children. Standard
height-weight tables for the school age child which may be used include the
Baldwin-Wood Tables (1), and the norms developed by the Fels In- 
stitute (7) and the Brush Foundation (5). The Pryor Tables (4) are 
somewhat more comprehensive, in that they include chest and hip 
measurements, and thus consider body build in arriving at an estimate 
of the child's status. 

The use of standard tables has several disadvantages. One must
guard against the typical tendency to regard the norm as the ideal,
an interpretation which fails to make allowance for individual
differences. Caution must be observed, too, in evaluating gains. Research
has demonstrated that children vary considerably in rate of growth at
similar ages. Such differences in rate of growth must be considered if
an accurate estimate of an individual child's progress is to be made.

The most satisfactory method of evaluating the child's growth and
development follows his progress from year to year, and interprets his
progress in the light of his physical constitution. Several methods are
available for charting the child's growth. The Wetzel Grid (8) may be
cited as an example of the approach used.

When weight is plotted [...] first panel of the Wetzel Grid [...]
successive measurements reveals the direction in which the child is
growing. Since healthy development continues along one channel, a change
in the child's nutritional or health status is indicated if the child moves
out of his established channel. If the curve moves to the left, a change
in the direction of a more stocky physique, leading ultimately to
obesity, is evidenced; if the curve moves to the right, a change to a
more slender physique and a loss in nutritional status is involved.
Crossing the channels at regular intervals are a series of lines which
provide a measure of level of development. The developmental level
the child has reached is plotted against chronological age on the
second panel of the grid. The resulting curve, or auxodrome, is a
measure of the child’s speed of development. A set of standard
auxodromes, which indicate how many children at a given age (expressed
as a fraction of the total number of children at that age in the
general population) may be expected to have reached a given
developmental level, is provided. It is thus possible for the person
using the grid to determine whether a child’s developmental level is
low or high in relation to the normal distribution of size for age.

The teacher can easily learn to plot height and weight measures on 
the grid, and to determine the child’s pattern of development. The 
device serves a very important function in screening out children for 
needed medical attention. 
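
The screening logic described above, in which a child who leaves his established channel is singled out for attention, might be sketched as follows. The numeric channel positions, the tolerance of half a channel, and the function name are assumptions introduced for illustration; they are not taken from the Wetzel Grid itself, which must be consulted for the actual channel layout and standards.

```python
def channel_drift(channel_positions, tolerance=0.5):
    """Describe how a child's physique curve has moved across channels.

    channel_positions: one number per successive measurement, giving a
    hypothetical horizontal position on the first panel. Smaller values lie
    to the left (stockier), larger values to the right (more slender).
    tolerance: assumed amount of movement treated as staying in channel.
    """
    if len(channel_positions) < 2:
        return "insufficient data"
    # The first measurement is taken here as the established channel.
    shift = channel_positions[-1] - channel_positions[0]
    if shift < -tolerance:
        return "moving left: toward a more stocky physique"
    if shift > tolerance:
        return "moving right: toward a more slender physique"
    return "within established channel"


# Hypothetical yearly positions for three pupils.
print(channel_drift([3.0, 3.1, 2.9, 3.2]))
print(channel_drift([3.0, 2.6, 2.1]))
print(channel_drift([3.0, 3.4, 4.2]))
```

A pupil flagged in either direction would be referred for medical attention; the sketch makes no judgment beyond the direction of movement.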


Tests of Motor and Sensorimotor Proficiency In general, evaluation 
in this area is not the province of the classroom teacher, and is re- 
stricted to the specialist in health education. Measurement of motor 
ability, measures of physical fitness, tests of physical skills and ath- 
letic ability involve the administration of individual performance tests 
which call for added training and experience on the part of the 
teacher. A description of such tests has therefore been considered be- 
yond the scope of this volume. The interested reader is referred to 
the comprehensive texts in the field by McCloy (3) and by Bovard 
and others (2). 


Paper-and-Pencil Tests and Questionnaires One common classifica- 
tion of instruction objectives is by health practices, information, inter- 
ests, and attitudes. Such a classification saves time in the collection 
of evidence—but one must not assume by such a classification that 
health attitudes are something unrelated to health information, for 
example. Actually interrelationships exist between the various parts of 
the classification; such interrelationships should be recognized in plan- 
ning the use of specific evaluation devices and in interpreting results. 


Health Practices The most frequently used device for obtaining
information concerning the pupil’s health practices is the health
practice inventory. This procedure is employed in the construction of the
Johns Health Inventory (Stanford University Press, 1948), which
samples the pupil’s health practices in the following areas: nutrition,
excretion, exercise, posture, defenses against communicable diseases,
defenses against non-communicable diseases, defenses against
accidents, defenses against habit-forming substances, use of scientific
services and facilities, and evaluation of health information. In his
inventory, Johns lists specific practices, e.g., "take plenty of time in
eating your meals," and the pupil is asked whether he engages in the
practice "never," "rarely," "sometimes," "usually," or "always."

Other health practice inventories have been developed along similar
lines. The variations in the technique are primarily in the categories
of types of health practices included and in the designation of the
“frequency” response to be checked by students. The health practices
inventory (Inventory 1, Health Activities), developed by the
Cooperative Study in General Education and published by the Educational
Testing Service, has practices classified into four categories: personal
appearance and hygiene, diet and nutrition, general operations of the
body, and health hazards. The pupil is asked to indicate whether he
“never,” “occasionally,” or “regularly” engages in each activity listed.
This instrument is now out of print.

The validity of such inventories for assessing the extent to which
pupils engage in good health practices and avoid unacceptable
practices is largely dependent upon the atmosphere in which they are
used. If the results of the inventory are to be used in assignment of
grades, or even if they are to be used in a pupil-teacher interview in
which the pupil may be reprimanded for his “bad” practices, the
validity will probably be low. On the other hand, if the inventory is used
for evaluation of a school health p[...] identified and if, in the
"testing" situation, [...] can be enlisted, the probability [...]

The use of the health [...] extended to pupil self-[...] only one who
"sees" th[...] and suggestions, includ[...] his results. He is also
[...] probability of validity. This approach has been used [...]
inventories dealing with [...]

Health Information The appraisal of the possession of health
information generally involves two kinds of devices: the essay
examination and the objective (short answer) type. Essay examinations are
most often used by teachers in the normal course of instruction.
Objective type examinations are usually used to appraise the increase of
health information over an extended period, that is, a semester or a
year.


TABLE 24
Illustrative Tests in Health Areas

TEST AND PUBLISHER                      DATE        CONTENT

Byrd Health Attitude Scales             1940-1941   Health attitudes
(Stanford University Press)

Gates-Strang Health Knowledge Tests     1937        Health information
(Teachers College, Columbia Univ.,
Bureau of Publications)

Health Awareness Test                   1937        Health information
(Teachers College, Columbia Univ.,
Bureau of Publications)

Health Education Tests: Knowledge       1946-1947   Health knowledge: application
and Application                                     of knowledge
(Acorn Publishing Co.)

Health Inventory for High School        1942        Health practices, information,
Students                                            interests, attitudes, analyzing
(California Test Bureau)                            health problems, and sources
                                                    of health information

Johns Health Practice Inventory         1943        Health practices
(Stanford University Press)

National Achievement Test:              1949        Health information; health
Health Test                                         practices
(Acorn Publishing Co.)

State High School Tests for Indiana:    1945-1947   Health information;
Health and Safety Education                         application of information
(Purdue University)

Trusler-Arnett Health Knowledge Test    1940        Health information
(Bureau of Educational Measurement,
Kansas State Teachers College)

GRADE (in printed order): 10-14; 3-8, 7-12; 5-8; 7-16; 10-16; 7-14; 7-12; 9-16

Objective type examinations dealing with the possession of health 
information usually employ the same techniques as those employed 
for appraising the possession of other kinds of information. Thus, de- 
pending upon the particular bias or interest of the test constructor, a 
health examination of the objective type may consist of true-false, 
multiple choice, matching, or even completion items. To some extent, 
of course, the nature of any item will depend upon the nature of the 
health information to be tested. 

A variation of the usual “true-false” test is employed in a health
information inventory developed by the Cooperative Study in General
Education, now out of print.³ The following directions are given at the
beginning of a series of 110 statements: "The purpose of this inventory
is to determine the extent of your knowledge of certain kinds of health
information. Read each of the following statements carefully and then
on the answer sheet which has been given to you, blacken the space in


Column 1: If you believe the statement to be true to the extent
that you would not hesitate to act on the basis of its
implications.

Column 2: If you believe the statement is probably true, that is,
more true than false.

Column 3: If you are completely uncertain about the statement,
that is, if you cannot decide as to its truth or falsity.

Column 4: If you believe the statement to be probably false, that
is, more false than true.

Column 5: If you believe the statement to be false to the extent
that you would reject any action based on its implications."


3 Cooperative Study in General Education: Inventory 2, Health Information.
Princeton, N. J.: Educational Testing Service, 1950.


In this inventory some of the items are keyed as uncertain items; that
is, the correct answer is uncertainty and the student should indicate
that he cannot decide as to its truth or falsity. For the other items, if
an item or a statement is true then either of the responses in Column 1
or 2 is keyed as correct. Similarly, if a statement is false, then a
response in Column 4 or 5 is keyed as correct. The additional
information obtained by this method of testing is a “certainty” score which is
represented by the number of “1” and “5” statements indicated by the
student. In this way it is also possible to identify the difference
between the correctness of the student’s response and the certainty
which he feels concerning the correctness of his information.
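
The keying rules just described can be sketched in Python. The key letters ("T", "F", "U"), the function name, and the extra count of responses that are certain but wrong are assumptions added for illustration; the inventory itself reports only correctness and the "certainty" score.

```python
# Columns keyed as correct for true, false, and uncertain items.
CORRECT_COLUMNS = {"T": {1, 2}, "F": {4, 5}, "U": {3}}


def score_inventory(key, responses):
    """Score a five-column true-false inventory.

    key: one letter per item, 'T' (true), 'F' (false), or 'U' (uncertain).
    responses: the column (1 to 5) the student blackened for each item.
    """
    correct = 0
    certain_but_wrong = 0  # hypothetical extra tally, not in the source
    for keyed, column in zip(key, responses):
        is_correct = column in CORRECT_COLUMNS[keyed]
        if is_correct:
            correct += 1
        elif column in (1, 5):
            certain_but_wrong += 1
    # The certainty score counts every "1" and "5" response, right or wrong.
    certainty = sum(1 for column in responses if column in (1, 5))
    return {
        "correct": correct,
        "certainty": certainty,
        "certain_but_wrong": certain_but_wrong,
    }


# Five hypothetical items and one student's responses.
result = score_inventory(["T", "F", "U", "T", "F"], [1, 5, 3, 4, 1])
print(result)
```

Comparing the "correct" and "certainty" counts gives a rough indication of whether a student is more certain than the correctness of his information warrants.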

Objective type examinations in the field of health information now 
on the market do not include any generally accepted content or cate- 
gories of content. Thus, one health information inventory is “struc- 
tured” in terms of five major categories of health problems: 


1. Personal appearance, hygiene, comfort.

2. Diet, nutrition, elimination.

3. Operation of the body.

4. Organic, physical, and chemical health hazards.

5. Reproduction and heredity.


The 110 items in the inventory are approximately evenly divided 
among these five major topics. Other health information objective type 
examinations contain items which sample the general field of health 
information in different ways. 

In objective type health examinations the individual items generally 
represent specific elements of health knowledge. For the most part, 
the information called for requires the reproduction of one or the 
other of two kinds of "facts." The larger group of such facts usually 
represents specific information obtained from physiological research. 
Generally, in an objective type test, relatively few items call for the 
formulation of predictions on the basis of general principles. 

Rather often in objective type health information tests, individual 
items reflect what might be called health attitudes ("you should see
your doctor once a year") or they represent items containing a mixture 
of fact and opinion or attitude. In order to reach clear-cut evaluations 
of the extent of achievement in the health objective, some of the 
more recent evaluation devices make fairly clear-cut distinctions be- 
tween health information and attitudes. 


Health Interests The evaluation and appraisal of health interests is 
based on two assumptions: first, that instruction in health or health 
problems should result in a greater interest or concern on the part 
of pupils, and, second, that knowledge of health problems or situa- 
tions of interest or concern to pupils in particular grades and in par- 
ticular parts of the country is helpful for the purpose of developing ap- 
propriate curricula in health instruction. The general technique em- 
ployed in a systematic evaluation of health interests is to list ques- 
tions about health or activities pertaining to health and to ask the 
pupils whether such questions or activities are of interest to them. 




The activity technique was used in surveying the health interests of 
Denver school children. The inventory used listed 250 specific activi- 
ties and asked the student to indicate whether the activity is some- 
thing he would like to do, would not like to do, or about which he 
was uncertain. This health interest inventory of activities sampled 
eighteen areas of health: keeping physically fit, group health, cause 
of disease, protection from disease, structure and function of the body, 
dental health, good eating habits, selection and composition of food, 
stimulants and narcotics, rest and relaxation, personal appearance, per- 
sonality development, social health, heredity and eugenics, first aid, 
home nursing, safety, and vocations and health. 

A health interest inventory developed by the Cooperative Study in 
General Education and used in secondary schools and colleges (In- 
ventory 3, Health Interests. Educational Testing Service, 1950, out of 
print) lists a series of specific questions, e.g., "How can athlete's foot 
be cured?" The student is to indicate whether 


1. The question is interesting and should be dealt with in
school.


2. The question is interesting but should not be dealt with
in school.


3. The question is not interesting.


This inventory consists of 129 items approximately equally distrib- 
uted over the same areas of health problems as those indicated in the 
above section on health information. 
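
One way of summarizing responses to such an interest inventory, area by area, is sketched below. The area names, response data, and function name are hypothetical and serve only to show the tallying; nothing here reproduces the scoring procedure of the published inventory.

```python
from collections import Counter

# The three responses offered to the student, coded as in the list above.
RESPONSE_CODES = (1, 2, 3)


def tally_interest(responses_by_area):
    """Return, for each health area, the fraction of responses per code.

    responses_by_area: dict mapping an area name to a list of codes, where
    1 = interesting and should be dealt with in school,
    2 = interesting but should not be dealt with in school,
    3 = not interesting.
    """
    summary = {}
    for area, codes in responses_by_area.items():
        if not codes:
            continue  # skip areas with no responses to avoid dividing by zero
        counts = Counter(codes)
        summary[area] = {code: counts[code] / len(codes) for code in RESPONSE_CODES}
    return summary


# Hypothetical responses pooled over a class for two areas.
print(tally_interest({
    "Diet, nutrition, elimination": [1, 1, 2, 3],
    "Operation of the body": [3, 3, 1, 3],
}))
```

Areas with a high share of the first response are those the pupils both find interesting and consider appropriate for school treatment, which bears on the second assumption stated earlier about curriculum planning.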


Health Attitudes The evaluation of health attitudes is based upon 
essentially the same assumption as that employed in the evaluation 
of other kinds of attitudes, namely, that it is possible to estimate a 
person's predisposition to act in certain ways in connection with 
health situations or problems. A number of people responsible for 
health instruction insist that the predisposition to act should be in- 
ferred from a sample of the individual's actual behavior. These health 
educators would say, "I will infer the presence or absence of good 
health attitudes from anecdotes of specific action in health problems." 
Other health educators, although they agree that the more valid 
measure of the attitude is revealed by an individual's action, are will- 
ing to accept verbalization concerning health action by the individ- 
ual as predictors of what would be done by the individual in a health 
problem. This willingness results partially from the difficulty of
obtaining anecdotal evidence concerning an individual's participation
or action in health problems. Because of this difficulty the “attitude
scale” technique has been employed to obtain estimates of health
attitudes.

The “attitude scale” technique usually consists of obtaining student 
or pupil reactions to a series of declarative statements. Generally, in 
this technique, the pupil is informed that there are no right or wrong 
answers and he is asked to indicate with which of the statements he 
agrees, with which he disagrees, and about which he is uncertain. 
Obviously, as in the case of devices for evaluating the presence of 
desirable health practices, such attitude scales are most likely to be 
valid when used in an atmosphere in which there is neither reward 
nor penalty associated with the pupil's response. 

One of the major difficulties in the construction and use of health 
attitude scales is the definition of the attitude or attitudes (predispo- 
sition to act) to be evaluated. Thus, in such scales it is often neces- 
sary to examine the entire scale item by item in order to get a “feel” 
for the attitudes implicitly represented by the statements present in 
the scale. A health attitude inventory (Inventory 4, Health Attitudes.
Educational Testing Service, 1950, out of print) developed by the Co- 
operative Study in General Education is constructed around six major 
premises, each of which is sampled by ten specific statements in which 
the premise is implicit. These premises are: 


1. That qualified agencies should have a right to control in
health problems or situations. 

2. That individuals have a responsibility to fulfill in main- 
taining at a maximum their own health, as well as the 
health of others. 

3. That good, adequate, and buoyant health requires the 
greatest concern for bodily functions, for protection from 
dangers of bacteria, etc. (A very high acceptance of this 
assumption by a person might be considered as indicating 
hypochondriacal tendencies.) 

4. That personal appearance, ruggedly manly physique, is
the most important element of health. (This premise assigns
high value to “big muscles,” “erect posture,” “tall,
dark and handsome” men and the like.)

5. That there is a purpose in nature to maintain health at a 
maximum, especially if nature is "uncontaminated" by 
man's activity. 

6. That there is one right way in health practices or prob- 
lems and that any appreciable departure from norms or 
standards of health or bodily functions is undesirable. 




Desirable health attitudes in this device are assumed to be accept- 
ance of the first two premises and a rejection of the last four as meas- 
ured by the reaction to the individual statements in which the prem- 
ises are implicit. 
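
The scoring rule stated above, acceptance of the first two premises and rejection of the last four, can be illustrated with a short sketch. It assumes that every statement is worded so that agreement implies acceptance of its premise; the response labels and the function name are likewise assumptions made for the example.

```python
# Desirable reaction to each of the six premises, as stated in the text.
DESIRABLE = {1: "accept", 2: "accept", 3: "reject", 4: "reject", 5: "reject", 6: "reject"}


def premise_scores(responses):
    """Count, per premise, the responses given in the desirable direction.

    responses: list of (premise_number, reaction) pairs, where reaction is
    'agree', 'disagree', or 'uncertain'. Agreement is assumed to mean
    acceptance of the premise implicit in the statement.
    """
    scores = {premise: 0 for premise in DESIRABLE}
    for premise, reaction in responses:
        wanted = "agree" if DESIRABLE[premise] == "accept" else "disagree"
        if reaction == wanted:
            scores[premise] += 1
    return scores


# A few hypothetical reactions; the full inventory has ten statements per premise.
print(premise_scores([
    (1, "agree"), (1, "disagree"),
    (4, "disagree"), (4, "agree"),
    (6, "uncertain"),
]))
```

With ten statements per premise, each premise would receive a score from 0 to 10, and an "uncertain" reaction counts toward neither direction in this sketch.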


Health Conditions and Procedures Almost always, evaluation in this 
area is conducted by means of the "checklist" technique. In this tech- 
nique a series of questions pertaining to health conditions and pro- 
cedures are asked. Such questions deal with the existence of health 
conditions, responsibility in the particular procedures, number of stu- 
dents or pupils involved, methods of record keeping, procedures for 
informing authorities, etc. 

The structure or classification of the specific items to be included
in such checklists depends to some extent on the use of such evidence
or information. A device⁴ developed for a survey of health and physical
education programs in secondary schools is divided into the following
major topics:


I. Description of school and community. 


II. Organization and administration of the school health 
program. 


III. Hygiene of the school program. 
IV. Health supervision. 
V. Hygiene instruction. 
VI. Hygiene of environment. 
VII. Physical education—boys. 
VIII. Physical education—girls. 


Other checklists devised to obtain evidence concerning practices
involved in the achievement of the health objective contain similar
classifications.

A major consideration in the selection or development of a checklist
for the purpose of assessing the validity of health conditions and
procedures is the use of such information. If the purpose of the
information is to obtain a rather general description of health conditions
and procedures in the school and the community or to check on the
most important of such conditions and procedures, the amount of
detail to be included in the checklist will probably be small. On the
other hand, if there are particular areas of health conditions and
procedures which are strongly suspect and concerning which major
revisions are likely to occur, the amount of detailed information to be
obtained by means of the checklist is likely to be great. A good
example of the broader type of checklist is that for surveying the
secondary-school program developed by the Michigan Department of
Public Instruction. An example of a checklist developed for the
purpose of obtaining very detailed information pertaining to a health
condition and practice is “A Yardstick for School Lunches” developed
by the United States Department of Agriculture. Table 25 includes
some of the checklists which are available in the area of health
instruction.


4 T. H. Dearborn, A Check List for the Survey of Health and Physical Education
Programs in Secondary Schools. Stanford University Press, 1940.


TABLE 25
Illustrative Checklists on Health Conditions and Procedures

TEST AND PUBLISHER                     DATE   CONTENT

An Appraisal Form for Local            1938   General checklist
Health Work
(American Public Health
Association, N. Y.)

Check List for Safety and              1939   Safety conditions and procedures
Safety Education                              in the school environment
(National Educational Association,
Washington, D. C.)

A Check List for the Survey            1940   Detailed list of conditions and
of Health and Physical Education              procedures pertaining to
in Secondary Schools                          health in secondary schools
(Stanford University Press)

A Yardstick for School Lunches         1948   Detailed checklist on conditions
(U. S. Office of Education,                   and procedures related
Washington, D. C.)                            to school lunches

A Check List for Surveying             1946   General checklist on conditions
the Secondary School Health                   and procedures pertaining
Program                                       to health in the secondary
(Michigan State Department                    schools
of Education)

Teachers Inventory of Their            1941   Checklist of health instruction
Health Education Activities                   activities
(Strang and Smiley: The Role
of the Teacher in Health
Education, The Macmillan Co.,
New York)




The Outcomes of the Use of Evaluation Procedures 


The use of the kinds of devices described above leads to the collec- 
tion of different kinds of evidence pertaining to the overall health ob- 
jective. Such evidence can obviously be summarized either as an in- 
dication of the variations which exist between the individual topics or 
aspects of the health objective, or as a guide to instructional policies 
and procedures. The judgments resulting from a consideration of such 
evidence and summaries of evidence require the application of certain 
standards or certain assumptions concerning what is desirable or un- 
desirable. 

Some of the questions which are listed below can be answered by 
evidence alone; others go beyond the question of evidence alone and 
assume that standards or assumptions concerning the level of attain- 
ment of the health objective are either in hand or can be determined. 


Questions Pertaining to the Achievement of Groups of Pupils in 
Objectives of Instruction 


1. Is there achievement in the instructional health objectives in the
school year by year?

2. In what ways (in which objectives) are groups year by year
showing achievement or lack of achievement? For example, is the
achievement primarily one of information, one of improved
practices, one of sharpened interests, etc.?

3. How does the achievement in the health objectives of groups—
for instance, students with science interests—compare with that
of groups where there is little relevance of the subject to
problems of health?

4. How does the achievement of groups or the school as a whole
compare with that shown by other schools?

5. To what extent is there consistency in the group or groups in
growth in health practices, health information, health interests,
and health attitudes?

6. How does the achievement of students as a group with intensive
attention to the health objective in instruction compare with that
where little or no attention is given to the health objective?


Questions Concerning the Achievement of Individual Students in 
Health Instruction 


1. Is the achievement of health objectives for individual students 
markedly different or somewhat uniform? 




2. What students (what kinds of students) reveal the greatest over- 
all achievement in health objectives? (What kinds reveal the least 
achievement?) 

3. Are students becoming more certain about their information in
regard to health than the correctness of such information indi- 
cates they should be? 

4. How are the students’ interests in health problems related to 
their practices and their information? 


Questions Involving Specific Aspects or Content of the Health 
Objective 


1. What kinds of health problems or situations are of greatest inter- 
est to students of different age levels or grades? 

2. About what specific elements of health information are students 
either ignorant or grossly misinformed? 

3. In what kinds of health problems are students’ practices, their
information and their interests most inconsistent? 


Questions Concerning School and Community Health Conditions 
and Procedures 


1. To what extent are specific conditions and procedures involving 
health consistent with practices, information, and attitudes sought 
for in health instruction? 

2. Are health conditions and procedures being maintained at the 
limit specified by law? 

3. Are there health conditions and procedures for which no person
has responsibility or authority, or for which authority is so di- 
vided that action is ineffective? 


Summary 


Evaluation procedures employed in the evaluation of pupil health 
and physical development include observation, physical measurement 
of growth and development, performance tests of motor and sensori- 
motor proficiency, and paper-and-pencil tests and questionnaires. 
Teacher observation is of value for identifying physical difficulties 
which become apparent during the course of the school day, and in 
the detection of visual and auditory defects through the use of simple 
screening devices. Repeated measurements of height and weight, when
interpreted in the light of the child's physical constitution, offer a sat- 
isfactory means of charting the child's growth and development. 




Paper-and-pencil tests have been developed in several areas. In- 
struments are available for measuring health information, health prac- 
tices, health attitudes, and health interests. Checklists for use in eval- 
uating health conditions and procedures in a school or a community 
are also available. 


Problems for Class Discussion 


1. Assume that, as the teacher of a fourth-grade class, you have been asked 
to identify children in your group who might need medical care. Pre- 
pare a checklist which you might use in order to make sure that you have 
not overlooked any important diagnostic signs of pupils’ health needs. 

2. Arrange for the administration of a typical health information inventory 
to a high-school class. Prepare an analysis of the results in order to 
identify those areas in which the class needs special guidance. 

3. Prepare, from your reading and from interviews with school personnel,
a comprehensive outline of a good school health program. For each
aspect of such a program, show what evaluative instruments or
techniques could be used to determine the success of the program in
achieving its stated aims.


References Cited in This Chapter


1. Baldwin, B. T., and Wood, T. D., Height-Weight-Age Tables for Boys
and Girls of School Age. New York: American Child Health Association,
1923.

2. Bovard, John F., et al., Tests and Measurements in Physical Education. 
Philadelphia: W. B. Saunders Company, 1949. 

3. McCloy, Charles H., Tests and Measurements in Health and Physical
Education. New York: F. S. Crofts and Company, 1944. 


4. Pryor, H. B., Width-Weight Tables for Boys and Girls from 1-17 Years.
Stanford: Stanford University Press, 1940. 


5. Simmons, Katherine, "The Brush Foundation Study of Child Growth and 
Development. I. Physical Growth and Development," Society for Re- 
search in Child Development Monographs, Vol. IX, No. 1, 1944. 

6. Rugen, Mabel E., and Nyswander, Dorothy, The Measurement of
Understanding in Health Education. In The Measurement of Understanding.
45th Yearbook, Part I, National Society for the Study of Education, 1946,
p. 215-219.

7. Sontag, L. W., and Reynolds, E. L., "The Fels Composite Sheet. I. A
Practical Method for Analyzing Growth Progress," Journal of Pediatrics,
26:327-335, 1945.

8. Wetzel, Norman C., Grid for Evaluating Physical Fitness, Cleveland: 
N.E.A. Service, Inc. 




References for Further Reading


American Association of School Administrators, “Health in Schools”
(Twentieth Yearbook). Washington, D.C.: National Education Association, 1942.

A comprehensive analysis of the total health education program of
the school; excellent as background for an understanding of the place
of evaluation in the school program.


Larson, Leonard A., and Yocom, Rachael, Measurement and Evaluation in 
Physical, Health, and Recreation Education. St. Louis: C. V. Mosby Co., 
1951. 

This book is a highly comprehensive and advanced treatment of meas- 
urement and evaluation, and includes excellent bibliographic material for 
specialized aspects of the field of health and physical education. 


Monroe, Walter S., editor, Encyclopedia of Educational Research. New 
York: The Macmillan Co., 1950. 

The article on Physical Education—Measurement, p. 835-842, presents 
an overview of materials in physical education not discussed in this 
chapter. The interested student will find many suggestions for further 
exploration. 

National Society for the Study of Education, The Measurement of Under- 
standing (Forty-fifth Yearbook, Part 1). Chicago: University of Chicago 
Press, 1946. 

Chapter XI, by Mabel E. Rugen and Dorothy Nyswander, presents a 
comprehensive analysis of the objectives of health education and suggests 
procedures for use in evaluating pupil understanding at various grade 
levels. 


Evaluating
Socio-Economic Status


CHAPTER TWENTY-TWO


In large measure, the evaluation techniques which 
have been considered in previous chapters have dealt with qualities 
or dimensions of the pupil. Thus, it is true that the use of measures 
of achievement, personal-social development, attitudes, and interests 
provides the teacher with a better understanding of the individual. 
Such measures, however, do not encompass all of the significant fac- 
tors which enter into the functioning of individuals and groups in the 
classroom. Knowledge of the home and community background of 
the child can also make a vital contribution to individual and group 
understanding and guidance. This chapter considers one of the im- 
portant aspects of the pupil's background—socio-economic status. 


The Effect of Socio-Economic Status 


PUPIL ACHIEVEMENT AND SOCIO-ECONOMIC STATUS 


Early in the history of the scientific movement in education, pupil 
achievement in basic skills was related to such factors as level of in- 
telligence and physical health. More recently, considerable attention 
has been given to the possible relation of achievement to personal ad- 
justment. However, some educators have also tended to accept as a 
fact that social class status was significantly related to achievement 
in school, although there has been little research evidence in the field. 
For the most part, such studies as are available have dealt with school 
“leavers” rather than with school failures. 

While a direct relationship between socio-economic status and
achievement in school is not clearly demonstrable, recent studies
suggest that the content of tests used to appraise achievement has a
social class bias. Eells (4) and Davis and Havighurst (2), for example,
feel that many specific items in tests of different kinds reflect
those experiences which are common to children drawn from families
of upper and upper-middle social class status.


PUPIL ATTITUDES AND SOCIAL CLASS STATUS 


Although considerable attention has been given to the development 
of many different kinds of attitude scales devised for the purpose of 
determining the presence or absence of both desirable and undesir- 
able attitudes of different kinds, relatively little work has been done 
on the problem of how such attitudes are related to socio-economic 
status or social class status. The way in which social class operates in 
relation to children's attitudes may be seen in Neugarten's (7) study 
of fifth- and sixth-grade children in a middle western community. 
Neugarten found that, as a group, children of the upper-middle and 
upper classes tended to be rated highly by other children in such 
traits as friendship, leadership, good looks, and other favorable per- 
sonal characteristics. Children drawn from lower classes were ranked 
low, and were said to be bad looking, dirty, and not wanted as friends. 

It is entirely possible, too, that the “keying” of attitude scales may 
itself be a function of social class status, or that the specific items or 
examples relevant to an attitude may be selected largely from a 
middle-class or upper-class point of view. The analysis of the results 
of attitude scales against the social class status of the pupils should 
reveal both significant information concerning the validity of such 
scales for use with different social class groups and clues about the 
blocks and barriers to achievement in desirable directions. 


SOCIAL CLASS MOBILITY AND VOCATIONAL GOALS 


Many different studies have revealed that occupational status seems 
to be the most significant single factor determining social class status. 
This fact, associated with the tendency for an upward mobility in the 
social class structure, indicates the importance of collecting evidence 
pertaining to social class status of pupils, particularly by those teach- 
ers who are charged with guiding pupils in their vocational choices. 
Such evidence is very important, in addition to or in relation to other 
evidence pertaining to the abilities of pupils, when engaging in coun- 
seling and semi-direct instruction for the purpose of selecting a voca- 
tion. The conflict between the pupil's desire for upward social mobility 
and the need for taking into account the native ability of the pupil 
in the selection of a vocation is probably one of the most significant 
dilemmas the teacher faces, particularly at the junior- and senior- 
high-school levels. 




SOCIAL CLASS AND SCHOOL ORGANIZATION 


Warner, Meeker, and Eells in Social Class in America (13) note 
that: "Teachers . . . it must be said, although one of the most demo- 
cratically minded groups in America, tend to favor the children of the 
classes above the Common Man and to show less interest in those 
below that level. Studies in the Deep South, New England, and the 
Middle West indicate that they rate the school work of children from 
higher classes in accordance with their family's social position and 
conversely give low ratings to the work of lower-class children." 

As an example of the way in which the social position of children 
may influence the sectioning practices on a grade, the following data 
from a school which follows a policy of homogeneous grouping may 
be cited (12). The principal of the school, using teacher estimates of 
pupil ability as a guide, divided a group of 103 girls into three sec- 
tions, Section A ostensibly including the best and Section C the poor- 
est subgroup. An analysis of class composition in terms of social class 
revealed that of ten upper-class girls, eight were placed in Section A, 
one in B, and one in C. Of seven upper-middle-class girls, six were 
assigned to Section A and one to B. Of 33 lower-middle- and inde- 
terminate-middle-class girls, 21 were placed in Section A, 10 in B, and 
2 in C. Of 53 lower-class girls, only six were assigned to Section A, 
28 to B, and 19 to C. 

While it is true that factors other than the social class status of 
pupils were probably operative in forming these class groups, the high 
relationship between status and placement in a given group leads one 
to wonder whether assignments to sections were made solely on the 
basis of achievement in school work. 
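
The placement figures quoted above are easier to compare when each social class is expressed as a percentage distribution across the three sections. The following is a minimal sketch of that tabulation, not part of the original study; the counts are those reported in the text, while the dictionary layout and the function name `section_shares` are illustrative assumptions.

```python
# Section placements reported in the text for 103 girls in three sections.
placements = {
    "upper": {"A": 8, "B": 1, "C": 1},
    "upper-middle": {"A": 6, "B": 1, "C": 0},
    "lower-middle and indeterminate-middle": {"A": 21, "B": 10, "C": 2},
    "lower": {"A": 6, "B": 28, "C": 19},
}


def section_shares(counts):
    """Return the percentage of one social class placed in each section."""
    total = sum(counts.values())
    return {section: round(100 * n / total, 1) for section, n in counts.items()}


# Percentage distribution for every class (illustrative helper, see lead-in).
shares = {cls: section_shares(counts) for cls, counts in placements.items()}
total_girls = sum(sum(counts.values()) for counts in placements.values())

print(f"Total pupils: {total_girls}")
for cls, pct in shares.items():
    print(f"{cls:40s} A={pct['A']:5.1f}%  B={pct['B']:5.1f}%  C={pct['C']:5.1f}%")
```

Read this way, eight in ten of the upper-class girls were placed in Section A, against roughly one in nine of the lower-class girls. The percentages describe the association only; they do not show why the assignments were made.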

Section groupings in which patterns of class organization parallel 
closely the existing class structure of the community may be noted 
in many schools. The need for a reexamination of present practice in 
the light of our democratic ideal is obvious. 


Determining Social Class and Status


SOME GENERALIZATIONS REGARDING SOCIAL CLASS 


Most of the studies of social class in the United States have dealt
not with country-wide class organization, but with class structure and
social mobility in towns and cities in several regional areas (1, 3, 5,
6, 11). On the basis of these studies, Warner and Lunt (10) have
advanced the following significant generalizations regarding social class:




1. Some form of socio-economic class or social rank exists in 
every American community. 

2. Societies such as those represented by American communities
must have some rank order of social classes to perform
those functions necessary for survival.

3. In large and complex populations, values are assigned by
all classes placing people in higher- or lower-class posi- 
tions. 

4. The social class structure becomes more complex with in- 
creasing complexity in the technological and economic 
structure. 

5. The number and nature of the classes necessary for de- 
scribing the class structure of a community depends upon 
the size, age, geographical location, and technological- 
economic condition of the community. 


Basically, the determination of social class status involves the identi- 
fication of patterns of statements and beliefs of the individuals com- 
prising the community. The detection of such patterns is essentially a 
type of sociological analysis. 


THE DIRECT APPROACH 


Warner, Meeker, and Eells (13) describe six techniques which, 
taken together, are used to reveal the patterns comprising the social 
classes of a community. The six techniques are: 


1. Rating of classes by matched agreements—in interviews
with informants of diverse social backgrounds, the analyst
obtains a configuration of social classes in the community
and names of individuals in each class. When the count
of matched agreements of informants on the class position
of a large number of people is high, the analyst assumes
that the class ratings assigned to the individuals listed are
accurate.

2. Rating by symbolic placement—an individual is rated as 
being in a particular social class because he is identified 
with certain superior or inferior symbols by informants. 

3. Rating by status reputation—an individual is assigned to a
particular class because he is reported to engage in activities
or possess certain traits which are looked upon as
superior or inferior by informants.

4. Rating by comparison—an individual is assigned to a given
class because informants feel that he is inferior, equal, or
superior to others whose status has been determined.

5. Rating by simple assignment to a class—an individual is
rated as being in a particular class because informants
assign the individual to only that class in the entire system
of classes.

6. Rating by institutional membership—an individual is as- 
signed to a given class because informants regard him as 
a member of certain institutions (cliques, associations, 
etc.) which are ranked as superior or inferior. 


In the six kinds of ratings, the social status analyst obtains informa- 
tion from many different kinds of individuals in a community con- 
cerning other members of the community. The ratings given in inter- 
views lead not only to a picture of the number and nature of the vari- 
ous social classes comprising the community's population but also to 
the membership of those individuals interviewed or “interviewed 
about" in the social classes so identified. This direct process for de- 
termining social class status is called "evaluated participation." 
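
The idea of "matched agreements" in the first technique can be illustrated with a small tabulation. The sketch below is an assumption-laden illustration and not Warner's own counting procedure, which this chapter does not spell out. The informant names, the persons rated, and the function `matched_agreement` are all hypothetical. For each person, it reports the class named most often and the share of informants who named it.

```python
from collections import Counter

# Hypothetical interview results: informant -> {person rated: class assigned}.
ratings = {
    "informant_1": {"Adams": "upper", "Baker": "lower-middle", "Clark": "upper-lower"},
    "informant_2": {"Adams": "upper", "Baker": "lower-middle", "Clark": "lower-lower"},
    "informant_3": {"Adams": "upper", "Baker": "upper-middle", "Clark": "upper-lower"},
}


def matched_agreement(ratings):
    """For each person, return (most frequent class, share of informants naming it)."""
    labels_by_person = {}
    for placements in ratings.values():
        for person, social_class in placements.items():
            labels_by_person.setdefault(person, []).append(social_class)

    summary = {}
    for person, labels in labels_by_person.items():
        modal_class, votes = Counter(labels).most_common(1)[0]
        summary[person] = (modal_class, votes / len(labels))
    return summary


agreement = matched_agreement(ratings)
for person, (modal_class, share) in agreement.items():
    print(f"{person}: {modal_class} (agreement {share:.0%})")
```

A high share for most persons rated corresponds to the condition under which the analyst accepts the class ratings as accurate; what level counts as "high" is a judgment the sketch does not settle.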

The description of the social classes of a community by statements 
obtained in the direct process outlined above is given in Table 26. 


TABLE 26* | Social Classes in a Community


LEVEL                 CLASS             DESCRIPTION

Above the             I. Upper          A group founded on wealth and ancient family
Common Man                              People who look down on everyone else in town
                                        Snobs
                                        The silk stockings
                                        The landed gentry
                                        The aristocrats
                                        The Mainstreeters

                      II. Upper         Not in the top group, but good substantial people
                      Middle            The level just below the top group
                                        Prominent but not tops
                                        The strivers
                                        People who are in everything
                                        The community leaders
                                        Working hard to get in the 400
                                        Above average but not tops

Common Man            III. Lower        Average people
                      Middle            Ordinary people
                                        Working people, but superior
                                        Top of the working people
                                        Not poor and not well off
                                        Good common people
                                        People with nice families but don't rate socially
                                        Nobodies (socially) but nice
                                        People just below the Country Club crowd
                                        Top of the common people

                      IV. Upper         The Mill people
                      Lower             The poor but honest
                                        Poor people but nothing the matter with them
                                        The little people
                                        Poor but hard working
                                        Poor but respectable
                                        "We're poor (UL) but not as poor as a lot
                                        of people (LL)"

Below the             V. Lower          The poor and unfortunate
Common Man            Lower             The chronic reliefers
                                        Tobacco road
                                        Poor whites
                                        Hill-billies
                                        River rats
                                        Peckerwoods
                                        Dirty and immoral
                                        People who scrape the bottom

* Adapted from W. Lloyd Warner, Marcia Meeker, and Kenneth Eells: Social
Class in America. Chicago: Science Research Associates, 1949, p. 67.



DETERMINING SOCIAL CLASS STATUS BY 
EMPLOYING STATUS CHARACTERISTICS 

The determination of social class status by this direct process is not
only time-consuming but also requires special training. It becomes
desirable to find a method for determining social class in which the
evidence is easier both to collect and to analyze. The basic problem
here is the same as that encountered with other testing devices, e.g.,
attitude scales, in which a simpler device is sought capable of
accurately predicting the results which would be obtained by the use
of the basic but more complex device or method. The results obtained
by the basic “evaluated participation” procedures may be considered as
the criterion when developing other, simpler methods for determining
social class status.

The evidence used in the determination of social class by the
evaluated participation method includes information concerning many
different characteristics of the individuals involved. The task in the
development of a good predicting procedure becomes one of shrewd
selection of the most significant of these characteristics and the proper
combination or weighting of them to yield the same results as those
obtained by the more complex approach.

Warner, Meeker, and Eells have developed such a formula which
predicts social class status accurately in more than 90 in 100 cases. The
formula includes four major factors: occupation, source of income,
house type, and dwelling area (scored below as neighborhood). The
authors give very detailed descriptions of each of the categories.


OCCUPATION                                 SOURCE OF INCOME
Score                                      Score
1  Professionals and proprietors           1  Inherited wealth.
   of large businesses.                    2  Earned wealth.
2  Semi-professionals and smaller          3  Profits and fees.
   officials of large businesses.          4  Salary.
3  Clerks and kindred workers.             5  Wages.
4  Skilled workers.                        6  Private relief.
5  Proprietors of very small               7  Public relief.
   businesses.
6  Semi-skilled workers.
7  Unskilled workers.

HOUSE TYPE                                 NEIGHBORHOOD
Score                                      Score
1  Excellent houses.                       1  Most exclusive section of town.
2  Very good houses.                       2  Area well above average.
3  Good houses.                            3  Area “nice and respectable”
4  Average houses.                            but not inhabited by society.
5  Fair houses.                            4  “Average” neighborhood populated
6  Poor houses.                               mainly by working men.
7  Very poor houses.                       5  Area close to industry or railroad
                                              and all kinds of people live there.
                                           6  Edge of slum.
                                           7  Strictly a slum area.

The Warner formula may be written as


     4 × Occupation Score
  +  3 × Source of Income Score
  +  3 × House Type Score
  +  2 × Neighborhood Score
  =  Total score (social class status score).


The total class status score is translated into its social class equiva- 
lent by the following table: 


TOTAL SCORE        SOCIAL CLASS EQUIVALENT
12-22              Upper class
25-34              Upper-middle class
37-50              Lower-middle class
54-63              Upper-lower class
67-84              Lower-lower class
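

The arithmetic of the formula and the conversion table can be sketched in a few lines of code. The weights and score ranges below are those given in the text; the names `WEIGHTS`, `CLASS_RANGES`, `status_score`, and `social_class` are illustrative assumptions. The table lists no equivalent for totals that fall between the printed ranges (for example, 23 or 24), so this sketch returns `None` for them rather than guessing at a rule.

```python
# Weights for the four status characteristics, as given in the formula.
WEIGHTS = {
    "occupation": 4,
    "source_of_income": 3,
    "house_type": 3,
    "neighborhood": 2,
}

# Conversion table from total score to social class equivalent.
CLASS_RANGES = [
    (12, 22, "Upper class"),
    (25, 34, "Upper-middle class"),
    (37, 50, "Lower-middle class"),
    (54, 63, "Upper-lower class"),
    (67, 84, "Lower-lower class"),
]


def status_score(ratings):
    """Weighted total of the four ratings, each on the 1-7 scale."""
    for name in WEIGHTS:
        if not 1 <= ratings[name] <= 7:
            raise ValueError(f"{name} rating must be between 1 and 7")
    return sum(WEIGHTS[name] * ratings[name] for name in WEIGHTS)


def social_class(score):
    """Look up the class equivalent; None if the score lies between ranges."""
    for low, high, label in CLASS_RANGES:
        if low <= score <= high:
            return label
    return None  # assumption: the text gives no rule for in-between totals


# Hypothetical family: clerk (3), salary (4), average house (4),
# "average" neighborhood (4).
family = {"occupation": 3, "source_of_income": 4, "house_type": 4, "neighborhood": 4}
score = status_score(family)
print(score, social_class(score))
```

For the hypothetical family shown, the total is 4 × 3 + 3 × 4 + 3 × 4 + 2 × 4 = 44, which the table places in the lower-middle class. Because the weights sum to 12 and each rating runs from 1 to 7, totals can range only from 12 to 84, the limits of the table.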


Warner, Meeker, and Eells report that two other characteristics, 
education and amount of income, have also been considered as a part 
of the formula. Their computations reveal, however, that these char- 
acteristics as such add so little that their inclusion is not worthwhile. 
According to the formula as given, occupation is the most significant 
characteristic for determining social class status. 


OTHER DEVICES FOR DETERMINING SOCIAL CLASS STATUS 


Socio-economic status or social class status has been estimated by 
other status characteristic devices. The better known of these devices 
are: 


1. Sim's Score Card for Socio-Economic Status.
2. Minnesota Home Status Index. 
3. Kerr-Remmers American Home Scale. 


The three devices indicated above, and others employing essentially
the same method, have been developed on the assumption that certain
characteristics differentiate social classes. The procedure employed in 
the development of such devices has been to postulate certain status 
characteristics—the possession of automobiles and bath tubs, obtained 
dental service, etc.—and to compare the data so obtained with an esti- 
mated socio-economic status of the community from which the data 
were drawn. 


Sim's Score Card for Socio-Economic Status ²


² Public School Publishing Co., Bloomington, Ill.

This device is a questionnaire consisting of twenty-three questions,
several of which have sub-parts. The questions are of such a nature
that children of the fourth grade or higher are assumed to know the
meaning of the questions and possess the necessary information
concerning the home to answer them.
The information sought in the score card deals with home facilities, 
education of parents, cultural activities of pupil and parents, books 
and magazines in the home, size of home, and parents' occupations. 
The responses to the questionnaire are scored by a key which is re- 
duced to a quantitative scale. Provision is made for correcting the 
scores of pupils who omit one or several of the questions. The score 
deduced is directly related to socio-economic status, high scores corre- 
sponding to high status and low scores corresponding to low status. 
The scoring of the Sim's questionnaire data pertaining to parents’ 
occupations is done in terms of an occupational classification consisting 
of five major groups: (a) Professional, executive, and proprietor of 
large businesses, (b) Commercial service, clerical service, intermedi- 
ate business proprietorship, and land owner, (c) Artisan proprietor- 
ship, skilled labor with some managerial responsibility, shop owner, 
and small business proprietor, (d) Skilled laborer, (e) Unskilled la- 
borer. 
While research has indicated that the reliability and validity of the
Sim's Score Card are adequate, the scale is difficult to score. It has
been criticized, too, for stressing the economic and cultural aspects of
the home, and neglecting to include any sampling of questions dealing
with the home's aesthetic desirability.


Minnesota Home Status Index ³


This index is based on an interview with a parent in the home. Fifty
interview questions are classified in the following groups: (a) Children's
facilities, (b) Economic status of family, (c) Cultural activities,
(d) Social status, (e) Occupational status, and (f) Educational
background of parents.


Although the use of an interview approach makes it possible to use
the Index on any grade level, the investment of time involved in a
visit to each home precludes widespread group appraisal. When cost
in time and personnel is not a major consideration, the use of the
Minnesota Home Status Index may be profitable.


Kerr-Remmers American Home Scale ⁴


This scale, to be answered by pupils of the sixth grade or above,
contains fifty items classified into cultural, esthetic, economic, and
miscellaneous sections. An attempt was made in the development of
the scale to make clusters of items as consistent as possible among
themselves but as independent as possible from other clusters. This
statistical approach serves to minimize the overlapping among the
four sections of the scale, thus increasing its diagnostic value (9).

³ University of Minnesota Press, Minneapolis, Minn.
⁴ Science Research Associates, Chicago, Ill.

The American Home Scale is much easier to score than the Sim's 
Score Card, and includes a greater range of items. However, it has 
been criticized as being applicable only to homes in urban com- 
munities. 


Uses of Knowledge of Socio-Economic Status 


Two of the most important questions which the democratically- 
minded teacher will be called upon to answer honestly are raised by 
Raths and Abrahamson (8) in their provocative discussion of student 
status and social class: 


1. Is the structure or organization of subgroups and cliques
of pupils related to social class status? If so, in what ways?
2. Are school promotion and participation in class activities
or offices related to social class status? If so, in what ways?


The assumption underlying the suggestion that such questions be 
investigated is that important information will be revealed which will 
be helpful in developing the worth and dignity of all pupils. There 
is no assumption that one social class is better than another or that 
any social class should enjoy more privileges than another or that one 
should be helped more than another.

The existence of pupil groupings or cliques is a fact which confronts 
every teacher. Often such groupings or cliques are criticized as hav- 
ing no place in a democracy. However, psychological research reveals 
that there are good reasons for the existence of the grouping or 
cliquing activity, reasons which often have little or no relationship to 
democratic procedures or assumptions. A pupil may be attracted to 
a group or a clique because, either consciously or unconsciously, he 
sees that by participating as a member of the group or clique, certain 
of his basic needs, such as the need for belonging or recognition, may 
be more adequately met. 

If the teacher can obtain reasonably valid evidence concerning the
social class structure of sub-groupings or cliques of pupils and at the
same time obtain evidence pertaining to their personal-social
adjustment and the emotional needs of individual members of the
sub-groups or cliques, he is then in a better position to make suggestions
or take action to develop sound inter-personal relationships.

Raths and Abrahamson have listed specific suggestions concerning 
the kind of evidence to be obtained regarding the organization of sub- 
groups and cliques as revealed in seating arrangements, playground 
activities, and lunchroom activities. They have also indicated pro- 
cedures which may be used by a teacher interested in determining 
the nature of the social class status of pupils in relation to grades and 
promotion, school prizes and awards, membership in appointive offices
and elective offices, and participation in extracurricular activities.

Teachers, like pupils, are identified with particular social classes. As
such they tend to reflect the values accepted by the particular social
class of which they are members. Such acceptance of values may be
conscious, although it is more likely to be unconscious. Teacher
self-evaluation is particularly important in this area.


Summary 


Techniques available for measuring socio-economic and social status 
include paper-and-pencil questionnaires filled out by students, an in- 
dex developed through the use of an interview technique, and a for- 
mula based on a sociological survey of a number of communities. 

A knowledge of the socio-economic status or social class status of 
his pupils is of great value to the classroom teacher. Not only does 
such knowledge provide an additional factor of importance in indi- 
vidual and group guidance, but it serves as a means of providing the 
teacher with added insights into his own behavior and attitudes. 


Problems for Class Discussion 


1. Apply the Warner formula and one of the other scales described in the 
text to ten families which you know fairly well. To what extent do the 
results agree? To what factors may disagreement be attributed? 

2. Using the techniques described in Chapter Eleven, develop a sociogram
for an elementary-school class. To what extent does social class
membership appear to enter into the formation of subgroups in the class?


References Cited in This Chapter 


1. Davis, Allison, Gardner, Burleigh B., and Gardner, Mary R., Deep
South. Chicago: University of Chicago Press, 1941.


2. Davis, Allison, and Havighurst, Robert J., "The Measurement of Mental 
Systems (Can Intelligence Be Measured)," Scientific Monthly, LXVI, 
301-316, 1948. 

3. Dollard, John, Caste and Class in a Southern Town. New Haven: Yale
University Press, 1937. 

4. Eells, Kenneth W., Social-Status Factors in Intelligence-Test Items. 
Unpublished Ph.D. dissertation, University of Chicago, 1948. 

5. Flint, John, Hometown: A Study of Education and Social Development. 
University of Chicago, 1940. 

6. Lynd, Robert, and Lynd, Helen, Middletown in Transition. New York: 
Harcourt, Brace and Co., 1937. 

7. Neugarten, Berenice L., "Social Class and Friendship Among School 
Children," American Journal of Sociology, LI, 305-313, January, 1946. 

8. Raths, L. E., and Abrahamson, S., Student Status and Social Class. 
L. E. Raths, Box 26, Bronxville, N. Y. 

9. Remmers, H. H., and Gage, N. L., Educational Measurement and Eval- 
uation. Revised edition. New York: Harper & Brothers, 1955. 

10. Warner, W. Lloyd, and Lunt, Paul S., The Social Life of a Modern 
Community. Vol. I, “Yankee City Series.” New Haven: Yale University 
Press, 1941. 

11. Warner, W. Lloyd, and Lunt, Paul S., The Status System of a Modern
Community. Vol. II, “Yankee City Series.” New Haven: Yale University
Press, 1942.

12. Warner, W. Lloyd, Havighurst, Robert J., and Loeb, Martin B., Who
Shall Be Educated? New York: Harper & Brothers, 1944.

13. Warner, W. Lloyd, Meeker, Marcia, and Eells, Kenneth, Social Class
in America. Chicago: Science Research Associates, 1949.


References for Further Reading 


It is strongly recommended that the student read at least one of the descrip- 
tive community studies cited in this Chapter (References 1, 3, 5, 6, and 11 
above). 

Hollingshead, August B., Elmtown's Youth. New York: John Wiley and Sons, 
1949. 

An excellent analysis of various aspects of the school system in terms 
of the position of the adolescent's family in the social structure of the 
community. 

Pfautz, Harold W., and Duncan, Otis D., “A Critical Evaluation of Warner's
Work in Community Stratification,” American Sociological Review,
15:205-215, 1950. 

Warner, W. Lloyd, Meeker, Marcia, and Eells, Kenneth, Social Class in 
America. Chicago: Science Research Associates, 1949. 

This book presents the basic procedural elements of Warner's approach 
to an evaluation of social class structure. As such, it constitutes required 
reading for every serious student of social psychology. 


Evaluating School and 
Teaching Practices 


CHAPTER TWENTY-THREE 


Every school system and individual school needs, at 
intervals, a reexamination of its theory and practice, of its personnel 
and facilities, and of its educational agents, the teachers. The evalua- 
tion, conducted by means of many devices, should be an interrelated 
and ongoing process, one which becomes incorporated into continu- 
ously growing plans and actions. Some of the instruments which have 
been developed measure the achievement of specific curricular ob- 
jectives, while others are more in the nature of a broad listing of 
criteria for the evaluation of over-all program objectives. 

This chapter deals with some of the rating scales, inventories, ques- 
tionnaires, examinations, and techniques which have been designed to 
evaluate school and classroom practices. The emphasis is on evalua- 
tion of school programs at the elementary and secondary level and on 
evaluation of teaching practices and teachers. An attempt is made to 
survey the variety of instruments and devices for the variety of ob- 
jectives and aspects which need evaluation. Each measuring instru- 
ment is briefly described and illustrated. There is no extensive discus- 
sion of the issues of reliability and validity, nor any comparison of 
competing instruments. Such discussion, although important, is beyond 
the scope of the present treatment. 


Evaluative Criteria for School Programs 


In recent years evaluative criteria for assessing the total school pro- 
gram have been developed by local, state, and national organizations. 
The criteria are generally based upon the best current practices and 
research findings in educational philosophy, objectives, curriculum, in-
structional methods, staff qualifications, and school services. Using
these criteria as norms, or reference points, a local school is evaluated
in terms of its philosophy, objectives, and practices. The instrument
for evaluation and the procedures for its use place the emphasis upon 
stimulation to growth by the school. It provides a diagnosis of the 
strengths and weaknesses of the total school program. 

The following section deals with some of the scales, inventories, 
and questionnaires developed to evaluate classroom and school prac- 
tices. 


Morrison-Ruegsegger Scale for Rating Elementary-School 
Practice (8) 


This scale is designed to answer three questions related to class- 
room practice in New York State elementary schools: (a) Is the in- 
structional program of an elementary school in harmony with modern 
concepts of the educative process in American education? (b) To 
what extent does an elementary school succeed in attaining the prac- 
tice of the modern concepts? (c) In which modern concepts are the 
practices of an elementary school strong or weak? The scale may be 
employed both as a supervisory and as a survey instrument. It is ad- 
vised that the content and implications of this scale be studied in- 
tensively before it is used. Staff conferences and meetings between 
teachers and observers should be arranged for that purpose. There 
should be repeated trial observations for a minimum of one hour per 
classroom per day. 

The rating which takes place is based on fifty-eight items, each of 
which represents a concept or element of practice frequently found 
in a modern elementary school. There are five points to the scale, but 
ratings 2 and 4 are not defined or described. A consideration of one 
of the items on the scale will best illustrate the scale. 

The scale items are listed under the following areas: methods, ma- 
terials, atmosphere and environment, and relationships. Items in each 
of the areas mentioned are further organized into sub-classifications. 
For instance, practices under the general heading of methods are 
listed as planning, experiencing, and evaluating and keeping records. 

The illustrative item below is characteristic in format of
all other items in the scale.


ITEM 5: Obtaining Information

1    Children merely accept statements of the text, teacher, other adult
     or their fellows without discussion, question, or outward sign of
     critical thinking.

3    Most children "look up" things only when directed to do so. There
     is some tendency to challenge printed or spoken statements. Some
     children show a tendency to find things out themselves by asking
     their elders, observing situations, visiting places, or through
     discussion.

5    Children employ elementary research techniques in searching for
     information rather than passively accept statements; they perform
     elementary experiments; go to books and periodicals for information
     on questions; check one authority against another. Primary as well
     as intermediate grade children insist on accuracy of information
     regardless of source from which it is derived.


An observer, or rater, in a classroom must decide which of the three
descriptions of class behavior is most representative of instructional
activity. It is apparent that point 1 on the scale is indicative of an in-
structional practice that stresses uncritical gathering and learning of
information. Point 3 on the scale is a blending of the formal and un-
critical approach with the more modern critical approach. Point 5 on
the scale represents a very modern practice in obtaining and evalu-
ating information as part of the learning process.

After observing and rating all items on the scale, a profile may be 
drawn to show wherein the instructional practices tend to be formal 
and traditional, modern, or a blend of traditional and modern. 
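
The profile described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the ratings are invented, only a few items per area are shown rather than all fifty-eight, and the cut-off values separating "formal and traditional," "blend," and "modern" are hypothetical, since the published scale is not quoted here as prescribing any.

```python
# Hypothetical item ratings on the five-point scale, grouped by the four
# areas named in the text (1 = formal, 3 = blend, 5 = modern).
ratings = {
    "methods": [1, 3, 3, 5],
    "materials": [3, 3, 4],
    "atmosphere and environment": [5, 4, 5],
    "relationships": [2, 1, 2],
}


def label(avg):
    """Map a mean rating to a descriptive label (cut-offs are assumptions)."""
    if avg < 2.5:
        return "formal and traditional"
    if avg <= 3.5:
        return "blend"
    return "modern"


def profile(ratings_by_area):
    """Return {area: (mean rating rounded to 2 places, label)}."""
    result = {}
    for area, values in ratings_by_area.items():
        avg = sum(values) / len(values)
        result[area] = (round(avg, 2), label(avg))
    return result


for area, (avg, description) in profile(ratings).items():
    print(f"{area:28s} {avg:4.2f}  {description}")
```

A per-area mean is only one way to summarize the ratings; a school staff might equally plot every item rating individually.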


New York Elementary School Inventory (15) 


The Elementary School Inventory has been devised to help those 
working in elementary schools to determine more clearly which ideas 
and corresponding practices they wish to continue and which ideas or 
practices they would like to discourage or eliminate. 

The inventory consists of a self-appraisal checklist and a support- 
ing digest of pertinent excerpts from publications of the New York 
State Education Department. 

The responses to the checklist represent the collective and group 
thinking of all those who are involved in the school program. It should 
be used at faculty meetings over a period of time after each member 
of the group has had the opportunity to study the lists. Each item on
the list can be checked Yes or No under the headings of theory and 
practice. If the theory or practice of a certain item is not desirable or
representative for certain teachers, the number of those teachers op-
posed is entered in the appropriate column. The same is done for the 
number who approve the theory or practice. Space for comment is 
provided after each section. An illustration of one section of the in- 
strument is given below. 


Section 4. The Material and Natural Environment

a. Arithmetic
                                                  Theory      Practice
                                                  Yes   No    Yes   No
1. In the primary grades, emphasis is placed
   on rich experiences which develop number
   readiness                                      ___   ___   ___   ___
2. Individual differences are recognized
   and adequate provision for individual
   development is made                            ___   ___   ___   ___
5. Effort is made to utilize children's ex-
   periences to develop an understanding
   of number concepts                             ___   ___   ___   ___


There are four parts to the checklist, each of which is subdivided. 
The major sections are: (a) Guiding pupil growth through curricular 
experience; (b) Guiding pupil growth through organizing and imple- 
menting the educational program; (c) Guiding pupil growth through 
school, home, and community relations; (d) Guiding pupil growth 
through recording, evaluating, and reporting. 
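
The group tally described for the checklist, in which the number of teachers approving and opposing each item is entered under Theory and Practice, might be sketched as follows. The response data and the function name `tally` are hypothetical and serve only to illustrate the bookkeeping.

```python
from collections import Counter

# Hypothetical responses: for each checklist item, one (theory, practice)
# pair of "yes"/"no" answers per teacher.
responses = {
    1: [("yes", "yes"), ("yes", "no"), ("yes", "yes")],
    2: [("yes", "no"), ("no", "no"), ("yes", "no")],
}


def tally(item_responses):
    """Count Yes/No answers under Theory and under Practice for one item."""
    theory = Counter(t for t, _ in item_responses)
    practice = Counter(p for _, p in item_responses)
    return {
        "theory": {"yes": theory["yes"], "no": theory["no"]},
        "practice": {"yes": practice["yes"], "no": practice["no"]},
    }


for item, answers in responses.items():
    print(item, tally(answers))
```

An item that most teachers approve in theory but few report in practice (item 2 in the invented data) is the kind of gap a faculty meeting would presumably want to discuss.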


McCall-Herring School Practices Questionnaire (7) 


This evaluative tool distinguishes itself from the rating scale and in- 
ventory hitherto mentioned in that the classroom practices are evalu- 
ated by the pupils themselves instead of by teachers or supervisors. 
It is intended for grades four to nine, restricted to curricular practices, 
and can only be administered after a class has been working together 
for four weeks or more. It is important that pupils understand the 
meaning of the task as well as be familiar with some of the terms 
used. The questionnaire consists of 105 items arranged in groups of 
five items. Each group deals with a specific element of the school pro- 
gram. An illustration of several items dealing with committee work is 
given below. 


In the last four weeks— 


56. Did YOU work in a committee or group
    of children chosen from two or more
    classes?                                      Yes   No
57. Did your teacher talk with YOU and
    others about how to choose a committee?       Yes   No
58. Did your teacher help YOU to learn how
    to work in a committee or group?              Yes   No
59. Did YOU and your teacher both suggest
    things for a committee to do?                 Yes   No


Evaluation of classroom practices by pupils assumes added signifi- 
cance if it is considered in conjunction with evaluation by teachers 
and supervisors. 


Elementary Evaluative Criteria 


Dr. James F. Baker, Boston University, supervised a committee in 
the formulation of Elementary Evaluative Criteria (1) for assessing 
the program of a modern elementary school. The basic format of state- 
ments of the guiding principles, checklist and evaluation items, and 
graphic summaries is similar to that in Evaluative Criteria for sec- 
ondary schools and is illustrated on subsequent pages of this chapter. 

The publication contains schedules and checklists for an evaluation 
of (a) philosophy and objectives of the elementary school, (b) kinder- 
garten program, (c) arithmetic, (d) arts and crafts, (e) health and 
physical education, (f) language arts, (g) music, (h) science, (i)
social studies, (j) library services, (k) guidance services, (l) school
plant, (m) school staff and administration. 

The administration of the Elementary Evaluative Criteria is similar 
to that for the secondary school. First, a self-evaluation is conducted 
by committees of the entire staff of the school, and each committee 
reports its findings to the entire staff. Second, the self-evaluation is 
checked by a visiting committee consisting of experienced and well- 
prepared professional personnel in the field of elementary education.

The criteria may be used in at least three general ways: (a) to study 
and improve an area or phase of elementary education by means of 
an in-service program, (b) to evaluate a single elementary school, and 
(c) to evaluate a group of elementary schools in a local school sys- 
tem. Although the emphasis is upon staff participation, effective results
may be realized when citizen groups participate with the staff in co-
ordinated group study activities.


Other Instruments for Evaluating School and Classroom Practice 


Departments of education of the following states have developed 
evaluative devices and materials for use in appraising the general edu-
cational program of the elementary school: New Jersey, Virginia,
Pennsylvania, Arkansas, and Ohio. According to Shane (13), self- 
rating scales, inventories, and discussion manuals are used by the vari- 
ous education departments to define, illustrate, and show the use of 
evaluative criteria. 


Evaluative Criteria—Cooperative Study of Secondary School 
Standards (4) 


The Cooperative Study of Secondary School Standards has made 
available, both in 1940 and in 1950, a list of evaluative criteria for all 
aspects of secondary-school programs. The Cooperative Study had its 
origin in 1933 when it was organized to achieve some of the follow- 
ing purposes: (a) To determine the characteristics of a good sec- 
ondary school, (b) To find practical means and methods to evaluate 
the effectiveness of a school in terms of its objectives, and (c) To de- 
termine the means and processes by which a good secondary school 
develops into a better one. The materials published in 1940 consisted 
of three manuals: Evaluative Criteria, Educational Temperatures, and 
How to Evaluate a Secondary School. 

The 1950 edition of Evaluative Criteria contains forms, or sched- 
ules, for entering data and judgments about various aspects of a sec- 
ondary-school program. Two schedules provide basic information 
about the pupil population and school community as well as the edu- 
cational needs of youth. Pupil population data to be entered on the 
schedule include: enrollments and graduates, age-grade distribution, 
range of mental ability of pupils, reasons for withdrawal of pupils, 
educational and occupational intentions of pupils. Data regarding the 
community include occupational and educational status of adults, 
financial resources of the community, and agencies affecting education, 
such as libraries, museums, service groups, and recreational agencies. 
Educational needs of youth are reviewed in terms of such major ob- 
jectives of secondary education as learning to live with others, main- 
taining sound physical and mental health, preparing for vocations and 
avocations, thinking logically, expressing self clearly, and learning to 
live in the natural and scientific environment. The local secondary
school may amend this list of educational needs to fit local conditions.

Another set of schedules relates to the educational program of the
secondary school and provides sixteen individual schedules for such
courses as art, business education, English, foreign languages, social
studies, science, and similar subjects of instruction. Excerpts from the
checklist schedule for English will illustrate the format in which the
criteria are presented. Section I—Organization is reproduced in full. 
For other sections, only a few sample items are given. 


I. ORGANIZATION

Checklist

( ) 1. English courses are required of all pupils. (____ years are
       required.)
( ) 2. Elective English courses are available.
( ) 3. Pupils are assisted by a qualified counselor or representative
       of the English department in selecting elective courses in
       English.
( ) 4. Remedial, or clinical, reading activities are available in
       addition to instruction in reading in regular courses.
( ) 5. Remedial, or clinical, speech activities are available in
       addition to instruction in speech in regular courses.
( ) 6. Grade lines are minimized by placing pupils in groups based on
       their English needs.
( ) 7. Individuals within a single class are grouped or identified for
       differentiation of teaching.
( ) 8. English courses are organized by themes or experiences with a
       minimum of emphasis upon type or chronology of English
       materials.

Evaluation

( ) a. To what degree are English courses or activities provided to meet
       the needs of all pupils?
( ) b. How satisfactory are the time allotments for English courses?
( ) c. To what extent do the enrollments in English courses show that
       the needs of all pupils for instruction in English are being met?
       (List courses indicating name of course, normal grade level, and
       number of pupils enrolled in each course for the current term.)


II. NATURE OF OFFERINGS

A. Literature

Checklist

( )    Opportunities are provided to develop skills essential to reading
       both as a study procedure and as a literary experience.
( )    Skills in reading are taught only as needed and in relation to use.
( )    Reading of classic and contemporary literature is required in addi-
       tion to the reading of newspapers and periodicals.
( )    Reading activities provide specific training in reading different
       types of literature (e.g., fiction, nonfiction, drama, poetry).


III. PHYSICAL FACILITIES

Checklist

( ) 1. Classrooms are equipped with movable furniture which can be
       adapted to group activities.
( )    Bookshelves are provided in all English classrooms.
( )    Audio-visual equipment is available for use by English classes.
( )    A stage, equipped with a curtain, is available for use by English
       classes.


IV. DIRECTION OF LEARNING 


A. Instructional Staff 


(For data on preparation of individual staff members, see Section J, "Data
for Individual Staff Members.")

Checklist 

All members of the English staff 

( ) 1. Have had background preparation in literature for adolescents, 
in American and English literature, and in literature dealing with 
other nations. 

( ) 7. Have had preparation in methods of teaching English. 

( ) 8. Are acquainted with diagnostic techniques and remedial instruc-
tion methods.


B. Instructional Activities 


( ) 1. Instruction in English contributes to the school's objectives. 

( ) 2. Instruction is directed toward clearly formulated, comprehensive 
(or long-range) objectives of the English program. 

( ) 8. Specific instructional activities contribute to the comprehensive 
objectives of the English program. 


C. Instructional Materials 
Checklist 
( ) 1. A variety of textbooks and library books is available.
( ) 2. Available textbooks and library books provide reading materials
designed to assist in the attainment of instructional objectives.
( ) 8. A variety of such reading materials as periodicals, pamphlets, and
newspapers is available for classroom use.


D. Methods of Evaluation 
Checklist 


( ) 1. Evaluation of class and individual accomplishment is an integral 
part of the teaching-learning activities. 


( ) 2. A variety of testing techniques is used (e.g., standardized tests, 
teacher-made objective tests, essay examinations). 


( ) 3. Efforts are made to improve the marking of essay examinations. 


Similar checklists are provided for the pupil activity program, li- 
brary services, guidance services, school plant, and school staff and 
administration. The checklists consist of provisions, conditions, or 
characteristics found in good secondary schools. All of them may not 
be necessary, or even applicable, in every school. A school may, there- 
fore, lack some of the items listed but have other compensating fea- 
tures. The checklists are intended to provide the factual bases for 
the evaluations. 

The use of checklists requires five symbols. (1) If the provision
called for in a given item of a checklist is made extensively, mark the
item in the parentheses preceding it with the symbol ✓✓, double
check; (2) if the provision is made to some extent, mark the item with
the symbol ✓; (3) if the provision or condition is made to a very
limited extent, mark the item with the symbol "X"; (4) if a provision 
is missing but is needed, mark the item with the symbol “M”; (5) if 
any provision or condition is missing and is not desirable or appropri- 
ate for the school, mark such item with the symbol "N." 
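
The five-symbol marking scheme lends itself to a simple count per checklist section. The sketch below is illustrative only: the marks are invented, the strings "double check" and "check" are stand-ins chosen here for the two check-mark symbols, and `summarize` is a hypothetical helper, not part of the Evaluative Criteria.

```python
from collections import Counter

# The five symbols and their meanings, as described in the text.
SYMBOLS = {
    "double check": "provision made extensively",
    "check": "provision made to some extent",
    "X": "provision made to a very limited extent",
    "M": "provision missing but needed",
    "N": "provision missing and not desirable or appropriate",
}


def summarize(marks):
    """Count each symbol; reject any mark outside the five defined symbols."""
    unknown = set(marks) - set(SYMBOLS)
    if unknown:
        raise ValueError(f"undefined symbols: {sorted(unknown)}")
    counts = Counter(marks)
    return {symbol: counts[symbol] for symbol in SYMBOLS}


# Hypothetical marks for the eight items of one checklist section.
marks = ["double check", "check", "check", "X", "M", "check", "N", "double check"]

for symbol, count in summarize(marks).items():
    print(f"{symbol:13s} {count}  ({SYMBOLS[symbol]})")
```

Items marked "M" are the ones such a count would single out for follow-up, since they record a need the school has not yet met.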


Self-Evaluation and Visiting Committee Evaluation 


Based upon extensive experience, organizations which have used the 
Evaluative Criteria recommend that a secondary school is best eval- 
uated by making a self-evaluation using the Evaluative Criteria and 
having this self-evaluation checked by a visiting committee composed 
of experienced and well-prepared professional workers in the field of 
education. In the self-evaluation phase, it is suggested that: 


a. The entire staff of the school should participate in the 
evaluation. 


b. At the outset, a steering committee should be appointed 
to organize the entire self-evaluation. 


c. Subject area committees and major section committees
should be organized. 


d. Periodic and well-organized meetings of the committees 
should be held. Many schools have spent five or six 
months in conducting their self-evaluation. 

e. At the close of the self-evaluation, all committees should 
report their findings to the entire staff. 


The visiting committee of experienced professional educators is or- 
ganized into subcommittees, similar to those of the school staff. After 
visitation and observation, the visiting committee compares its eval- 
uations with the self-evaluations of the school staff and may modify 
these reports if evidence indicates that such modification is necessary. 
The visiting committee concludes with an oral report to the entire 
school staff. Subsequently, the chairman of the visiting committee 
prepares a written report. 


Evaluation of Classroom Climate 


Careful observers with a knowledge of the dynamics of group ac- 
tion can assess, in a crude way at least, the social climate of a class- 
room without the use of rating instruments. In one classroom they may 
discern a warm and friendly climate, in another a temperate and co- 
operative climate, and in another a cool and unfriendly climate. Since 
the personal and social development and growth of children is an im- 
portant objective of education, it is essential to study more system- 
atically the effects of social climate on the behavior of pupils in a 
classroom. 

An early study on the climate of classrooms was conducted by 
Wrightstone (16). Data on classroom climate were obtained through
a controlled-observation or time-sampling technique. A trained ob-
server made notations of pupil behavior in the classroom. Each time
a pupil engaged in a specified activity, the observer wrote the coded
symbol for this activity next to the pupil's name. During the total
period of observation, any given pupil might engage in specified activ- 
ities under observation any number of times. In addition to the quanti- 
tative data thus obtained, the observers made notes or anecdotal rec- 
ords of sample situations, activities, experiences, and expressions for 
each pupil in relation to each defined activity. The accumulated ma- 
terial was then rated by a jury. 

The types of defined activities were summarized in these categories: 
(a) self-initiated activities, such as voluntarily bringing clippings, ex- 
hibits, books, etc., for school activities; (b) cooperative activities, such 
as helping other pupils or teachers and offering help or materials; (c) 
critical activities, such as criticizing or praising the work of others; 
(d) leadership activities, such as organizing, directing, or controlling 
new combinations of persons and things; and (e) work-study activi- 
ties, such as using time wisely, working efficiently, and clearing away 
materials. 
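
The quantitative side of this time-sampling record, a coded symbol entered next to a pupil's name each time a defined activity occurs, can be sketched as a tally of events. The two-letter codes, pupil names, and the function `tally_observations` are hypothetical; the study's actual code symbols are not given in the text.

```python
from collections import Counter, defaultdict

# Hypothetical code symbols for the five categories of defined activity.
CODES = {
    "SI": "self-initiated",
    "CO": "cooperative",
    "CR": "critical",
    "LE": "leadership",
    "WS": "work-study",
}


def tally_observations(events):
    """Count how often each pupil was observed in each coded activity.

    `events` is a sequence of (pupil, code) pairs in the order observed.
    """
    per_pupil = defaultdict(Counter)
    for pupil, code in events:
        if code not in CODES:
            raise ValueError(f"unknown activity code: {code!r}")
        per_pupil[pupil][code] += 1
    return {pupil: dict(counts) for pupil, counts in per_pupil.items()}


# Invented notations from one observation period.
events = [("Ann", "SI"), ("Ben", "CO"), ("Ann", "SI"), ("Ann", "LE"), ("Ben", "WS")]

for pupil, counts in tally_observations(events).items():
    print(pupil, counts)
```

Such counts cover only the frequency data; the anecdotal notes rated by the jury are a separate, qualitative record.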


In addition to the pupil behavior code, a similar observation code 
regarding teacher activities was employed. Evaluating the social cli- 
mate of the classroom through these methods made possible the con- 
clusion that in the classrooms of teachers using newer-type practices, 
many more opportunities were offered to pupils for self-initiated, co- 
operative, and leadership activities. It was similarly possible to note 
how teacher personality and practices affected the democratic atmos- 
phere in the classroom. 

Current attempts to measure social climate of groups are repre- 
sented by the studies of Lippitt and his associates. In one of his stud- 
ies Lippitt (6) used observational time sampling as well as sociometric 
techniques to measure the effect of authoritarian and democratic lead- 
ership upon the atmosphere or morale of a group. The significant con- 
tribution of the work of the group dynamics school lies in their defini-
tion of behavior units in such a way as to make them accessible to
observation and classification.


Pupil-Teacher Rapport Scale (17) 


In several studies in the New York City schools, it was desirable to 
obtain a measure of the social climate of classrooms. Instead of using 
the observational time-sampling techniques, which are costly in time 
and personnel, a rating scale designed to yield a measure of pupil- 
teacher rapport was formulated. The assumption underlying the con- 
struction and use of this scale was that by observing and rating im- 
portant components of pupil-teacher rapport, an assessment of the 
classroom climate could be obtained economically by a series of sev- 
eral visits to the classroom. 

The design of the observation scale was as follows: Identification 
data were recorded for such items as name of teacher, school, class 
or grade, number of pupils, date, name of observer, subject or topic 
of class activity, type of class activity observed, and the group struc- 
ture of the class during the period of observation. The observer 
checked or rated the items on the scale which provides a description 
and a measure of the social climate of the classroom. Categories listed 
in the scale were the following: Pupil-teacher interaction pattern, de- 
gree of social interaction, quality of social interaction, interest, enjoy- 
ment, role structure, emotion of teacher, teacher orders or sugges- 
tions, physical tension of group, and emotional tone of pupil group.

Some categories were rated on a four-point scale, others on a five-
point scale. The rating schedules used in two of the categories are
reproduced below:

Degree of Social Interaction                                Ratings

1. No interaction (e.g., listening to lecture,
   silent seat work)                                        ______
2. Infrequent interaction (occasional
   conferences with teacher)                                ______
3. Interaction (natural pupil interaction
   in semi-free situations)                                 ______
4. Frequent interaction (e.g., interchange
   in free situation)                                       ______
5. Maximal interaction (e.g., conference
   in subgroups)                                            ______


Emotion of Teacher                                          Ratings

1. Aggressive (openly hostile-sarcastic,
   etc., toward pupils)                                     ______
2. Irritable (tone of irritability in dealing
   with pupils)                                             ______
3. Toleration (teacher is straining to
   keep from irritability)                                  ______
4. Pleasant-reserved (friendly and reserved
   with depth of contact)                                   ______
5. Warm and sympathetic (sympathetic,
   "good fellow" relations)                                 ______


Determination of reliability and validity of such rating scales pre- 
sents some difficulty. In the above-mentioned study, the degree of 
agreement among several observers who made the ratings was ob- 
tained and the percentage of agreements on the ratings was compiled. 
It was found that the reliability of ratings was high; in all situations, 
experienced and trained observers who had used the scale for at least 
five to ten classroom observations showed approximately ninety per 
cent agreement on the item ratings they made in the same classroom 
situation. In order to measure the validity of the scale, supervisors 
were asked to make notes about the rapport between teachers and 
children in selected classrooms. These records were rated by a jury 
and compared with the data obtained by means of the observational 
scale. From the comparisons, it became evident that the scale had a 
high degree of correspondence, or correlation, with the “anecdotal 
records” kept systematically by supervisors over a period of several 
months. 
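
The reliability figure reported above, the percentage of item ratings on which observers agree, can be illustrated with a short computation. This is a sketch under assumptions: the ratings are invented, agreement is taken to mean an exact match on an item, and averaging over all pairs of observers is one plausible reading; the study's exact procedure is not described in the text.

```python
from itertools import combinations


def percent_agreement(a, b):
    """Percentage of items on which two observers gave the same rating."""
    if len(a) != len(b) or not a:
        raise ValueError("rating lists must be non-empty and of equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mean_pairwise_agreement(ratings_by_observer):
    """Average percent agreement over every pair of observers."""
    pairs = list(combinations(ratings_by_observer.values(), 2))
    if not pairs:
        raise ValueError("at least two observers are required")
    return sum(percent_agreement(a, b) for a, b in pairs) / len(pairs)


# Invented ratings by three observers on ten scale items in one classroom.
observers = {
    "A": [3, 4, 2, 5, 4, 3, 4, 4, 5, 3],
    "B": [3, 4, 2, 5, 4, 3, 4, 3, 5, 3],
    "C": [3, 4, 2, 4, 4, 3, 4, 4, 5, 3],
}

print(percent_agreement(observers["A"], observers["B"]))
print(round(mean_pairwise_agreement(observers), 1))
```

Percent agreement makes no allowance for agreement that would occur by chance, so it tends to look more favorable than chance-corrected indices.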


Evaluation of Teaching Practices 


Tests or inventories are available to measure teacher knowledge in 
child and adolescent psychology and attitudes toward modern con-
cepts and practices concerning the role of the school in present-day
society. The purpose of the How I Teach inventory is to measure what 
teachers know about the wants, needs, problems, developmental status, 
and incipient personality disturbances of children and adolescents. 
The purpose of What Should Our Schools Do? is to measure attitudes 
of teachers, parents, and citizens toward progressive trends and prac- 
tices in modern education. These instruments are discussed as meth- 
ods for evaluating teaching practices. 


The “How I Teach" Inventory (5) 


Kelley and Perkins have designed an instrument which provides a 
comprehensive evaluation of the teacher's knowledge and insights in 
child and adolescent psychology. The inventory measures what teach- 
ers know about the wants, needs, problems, developmental status and 
incipient personality disturbances of children and adolescents. Items 
in the test are derived from four sources: (a) literature in the area of 
child and adolescent psychology as well as in other areas of psychol- 
ogy, (b) the case histories of problem children, (c) students' descrip-
tions of teachers liked and disliked, and (d) observations noted dur-
ing visits to classrooms. Items are grouped by teaching practices, opin- 
ions, and factual results of experimental study. The teacher's responses 
on the inventory are scored in terms of norms set by the judgments of 
recognized authorities in the areas of child, adolescent, and clinical 
psychology, as well as by authorities in mental hygiene, psychiatry, 
and other fields. Many safeguards regarding validity of results were
observed in the construction of the test. One consisted in having the
inventory filled out by teachers rated as "plus" by administrators and
principals and also by teachers rated as "minus." Differences in scores
were significant.

In using the inventory, the teacher is asked to rate a number of
actions or practices in terms of what his own practice is (or would
be) in dealing with the problem or situation, using a five-point rating
scale: 1 (decidedly harmful), 2 (probably harmful), 3 (doubtful value),
4 (probably good), 5 (decidedly good). The following are typical prac-
tices which may be rated:


1. Requiring an additional assignment from a pupil who is
disorderly

2. Threatening to punish the pupil who tells lies

3. Warning the pupil who masturbates that it leads to poor
health

4. Refusing to allow pupils to talk without first obtaining
permission

This self-appraisal of teaching practices should be of importance to 
the teacher in evaluating his own understanding of the principles of 
mental hygiene against the background of the theory and practice de- 
veloped by psychologists and mental hygiene workers. A teacher's re- 
sponses to this inventory indicate his fitness for the responsibility of 
dealing constructively with youthful personalities in the classroom. 
Evaluation of such teacher insights is of equal importance with the 
evaluation of his general mental ability, general cultural attainment, 
and mastery of subject matter. 


"What Should Our Schools Do?" (9) 


What Should Our Schools Do? is the name of a questionnaire which 
deals with the attitude of the teacher toward the role of the school 
in modern society. Specifically, it measures the degree to which teach- 
ers are aware of and in agreement with progressive ideas concerning 
the role of the school. The instrument consists of 100 statements with 
which the teacher indicates his agreement or disagreement. 

There are seven categories of school policy covered by the state- 
ments. Some of these areas are: 


Willingness to accept change in the local educational pro- 
gram and research, or experimentation in various aspects of 
the local school situation (21 items) 

Readiness to accept an intellectually tolerant point of view; 
one of freedom from personal bias or prejudice (17 items) 

Desire to broaden the curriculum or to take the school out 
of the classroom and into the community life, and vice versa 
(9 items) 

Acceptance of possible related consequences involved in a 
program of liberalizing the educational program (8 items) 


An interesting relation between responses on the How I Teach and 
the What Should Our Schools Do? instruments is noted by Remmers 
(11:467). The scores of 170 college students of education answering 
both schedules show a rather high correlation. He concludes that stu- 
dents who have progressive ideas concerning pupil behavior and ad- 
justment problems will tend to have more progressive ideas concern- 
ing the role of the schools in our society. 


Evaluation of Teachers 


Since the teacher plays such an important role in influencing the 
personal growth and scholastic achievement of pupils, any thorough 
evaluation program should include evaluation of the teacher himself. 
Aside from the routine evaluation of teachers by supervisors and prin- 
cipals, there are three major forms of teacher evaluation. Of primary 
significance are techniques of self-evaluation which constitute a “dem- 
ocratic process of self-supervision based on the assumption that the 
teacher is eager for self-improvement” (11:451). A second area of 
teacher evaluation is the rating of teachers by pupils, and a third area 
is national examinations, which create norms of teacher ability against 
which an individual teacher may evaluate himself. 


Teacher Rating Scales 


Many rating scales are constructed to allow superintendents, super- 
visors, and principals to rate individual teachers. Remmers (11:467) 
expresses the view that this type of evaluation has severe limitations. 

The Torgerson Diagnostic Teacher Rating Scale of Instructional Ac- 
tivities (published by the Public School Publishing Co., Bloomington, 
Ill.) illustrates the appraisal of different characteristics ordinarily as- 
sociated with teaching success. An excerpt from this scale follows: 


Discussion Period 

a. Class discussion limited to brightest pupils 
b. Majority of pupils participate in the discussion 
c. Majority of pupils show lack of interest in the 
discussion 
d. Teacher discourages discussion or questioning 
e. Discussion period seldom provided 

The Torgerson Scale is composed of eighteen items, as follows: as- 
signment, discussion period, pupil diagnosis, remedial instruction, 
drill material, measurement of individual differences, provision for in- 
dividual differences, technique of measuring results, sequence of top- 
ics, types of criticism, pupil attention, results of motivation upon pu- 
pils, pupil activity, attention to heating, lighting and seating, use of 
instructional materials, control over pupils, method of handling prob- 
lem cases in discipline, and corrective measures. Studies of teacher 
rating scales devised show varying degrees of validity. The rater's 
skill, knowledge, and familiarity with the aspects of the teacher rated 
are important. The element of rapport between rater and individual 
being rated is a further factor to be considered. 


Rating of Teachers by Pupils 


The attitude of children toward their teacher is of primary impor- 
tance in the acquisition of knowledge and in achieving objectives of 
emotional and social growth. Pupils are in continuous contact and 
interaction with their teachers and have an opportunity to observe 
them closely. The ratings of teachers by pupils may be used as 
part of the evidence of the degree of teaching success. Three scales 
which have been devised are discussed below. 


The Purdue Rating Scale for Instructors (2) 


The Purdue Rating Scale for Instructors is a graphic rating scale 
covering ten areas in which the pupils are asked to rate their teachers: 
interest in the subject, sympathetic attitude toward students, fairness 
in grading, liberal and progressive attitude, presentation of subject 
matter, sense of proportion and humor, self-reliance and confidence, 
personal peculiarities, personal appearance, ability to stimulate intel- 
lectual curiosity. The rating scale is essentially a three-point scale, but 
graphic representation permits intermediate ratings. One rating scale 
item, dealing with interest in subject, is presented below. 


Interest in subject 

_______________________________________________________________ 
always appears full        seems mildly        subject seems irksome 
of subject                 interested          to him 


The Diagnostic Teacher Rating Scale (14) 


This scale by Tschechtelin is designed for elementary-school chil- 
dren in grades 4 to 8, and permits an indication of their attitudes to- 
ward their teachers. There are two parts to the scale, one diagnostic 
and the other general. In both parts seven features are considered: 
pupil’s liking for teacher; ability of teacher to explain; kindness, 
friendliness, and understanding; fairness in grading; discipline (keep- 
ing order); amount of work required; pupil's liking for lessons. 

The instructions are these: “Following are a number of questions 
about your teachers. Please answer them honestly. Your teachers will 
never know how you have rated them. Do not write your name on 
this sheet.” (This emphasis on honesty and anonymity applies to all 
rating scales of this nature; without it the validity of the data is 
limited.) 


Some of the questions on the general scale are: How well do you 
like your teacher? How fair is your teacher in grading? How well does 
your teacher keep order with children? On the diagnostic scale, seven 
statements, scaled in accordance with Thurstone's method, are pre- 
sented to the pupils. An example of the rating scale is given below. 


I. Liking for Teacher 

1. Is the one I like best 
2. Is humorous at times 
3. Keeps everything in the room neat 
4. Is pretty 
5. Is not polite 
6. Always wears a frown 
7. Is too grouchy 


The Bryan-Yntema Rating Scale (3) 


This scale for the evaluation of the teacher by students is designed 
for use at the secondary-school level. The pupil states his opinion of 
the teacher by responding to thirteen questions, such as the following: 


What is your opinion concerning the sympathy shown by 
this teacher? 


What is your opinion concerning the ability of this teacher 
to explain things clearly? 


What is your opinion of the ability this teacher has to make 
the classes lively and interesting? 


The student rates the teacher on a five-point scale: excellent, good, 
average, below average, and poor. Each of these scale values is clearly 
defined for the student. For instance, average in the sympathy ques- 
tion is defined as "generally kind, considerate and friendly, but every 
once in a while fails to see the student's point of view." 

The frequencies are multiplied by the values assigned to each level, 
the products are added, and the sum is divided by the total number 
of raters to obtain an average for each of the ten items listed in the 
illustration which follows. These average ratings are then translated 
into a scale from 60 for poor to 100 for excellent. In addition to the 
ratings, the student is requested to state reasons for his high (or 
favorable) and low (unfavorable) ratings. 
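
The averaging procedure just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the source gives the endpoints of the reporting scale (60 for poor, 100 for excellent) but not the values assigned to each level or the rule for intermediate points, so the level values 1 to 5 and the straight-line conversion below are hypothetical, as are the tallies in the example.

```python
# Assumed level values; the source does not state the values actually used.
LEVEL_VALUES = {"excellent": 5, "good": 4, "average": 3,
                "below average": 2, "poor": 1}


def item_average(frequencies):
    """Multiply each frequency by its level value, add, divide by raters."""
    raters = sum(frequencies.values())
    if raters == 0:
        raise ValueError("no ratings given")
    total = sum(LEVEL_VALUES[level] * count
                for level, count in frequencies.items())
    return total / raters


def to_reporting_scale(average, low=60.0, high=100.0):
    """Convert an average in the range 1-5 to the 60-100 reporting scale.

    A straight-line conversion is assumed; only the two endpoints
    (poor = 60, excellent = 100) come from the text.
    """
    return low + (average - 1) * (high - low) / 4


# Invented tallies for one question answered by 31 students.
example = {"excellent": 20, "good": 9, "average": 2,
           "below average": 0, "poor": 0}
avg = item_average(example)
scaled = to_reporting_scale(avg)
print(f"average level {avg:.2f}, reported as {scaled:.0f}")
```

A teacher who tallies the responses for each question in this way obtains one reported average per item, which is the form in which the results are charted.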

Knowledge of the ratings can help a teacher make certain improve- 
ments in personality or procedures (11). It is interesting that a change 
in one or two practices has the result of bringing about higher ratings 
in other aspects as well, in some cases due to halo effect. 

The illustrative summary of ratings and reactions by thirty-one stu- 
dents to a chemistry teacher was obtained by use of the Bryan- 
Yntema Rating Scale. The chart, which follows, provides the averages 
obtained on the ten key questions of the scale. These averages in the 
second column present a profile of student judgment about various 
abilities and personal characteristics of the teacher. 


CHEMISTRY TEACHER 
Reactions Given by 31 Twelfth-Grade Students 

Item   Average   Key 
  1       98     Knowledge of subject matter 
  2       87     Ability to explain 
  3       92     Fairness in marking 
  4       91     Discipline 
  5       92     Sympathy 
  6       90     Amount of work teacher does 
  7       83     Ability to make class interesting 
  8       88     Ability to plan work 
  9       81     Voice 
 10       86     General teaching ability 


The chemistry teacher received very favorable ratings, or averages, 
on knowledge of subject matter, fairness in marking, discipline, sym- 
pathy, and the amount of work he does. Favorable but lower aver- 
ages were given on ability to explain, ability to make class interesting, 
ability to plan work, and voice. There is little doubt that the students 
have a high regard and liking for this teacher and his abilities gener- 
ally. The diagnostic, free-response comments of students follow. The 
numbers in parentheses after favorable and unfavorable comments in- 
dicate the number of students making such comments. 


Favorable Comments: Unusual background of experience 
and ability in the field, well informed on subject (17); it is 
always possible to absorb his vast knowledge of subject mat- 
ter, which is expressed in terms excellent for taking notes 
(8); anxious to assist students in problems which affect them 
most (7); very human; a good man; a pleasant voice—al- 
ways easy to hear (5). 

Unfavorable Comments: Voice is nasal and monotonous 
(6); speech is slow and deliberate (9); allows class to in- 
terrupt too much (7); not enough class participation (2); 
too much theory and not enough practical application (2); 
some subjects are drawn out beyond interest of class; some 
topics are handled too sketchily; gives midterm exam. 

Interpretation: The averages show that the students as a 
group have high regard and great respect for this teacher. 
However, in two items, "Voice" and "Ability to make class 
interesting" the averages, 81 and 83 respectively, while not 
so low as to cause great concern, suffer by comparison with 
the other high averages. 


The most significant item here seems to be the reaction of the stu- 
dents to the voice of the instructor. Six commented on the nasal qual- 
ity and nine mentioned slow speech. While little can be done for 
nasal quality, except perhaps a drastic surgical operation, a deter- 
mined effort by this teacher to speed up his speaking rate will un- 
doubtedly aid in changing the reaction of the students. It is probable 
that an improvement in this area will also aid in bringing up the aver- 
age on "Ability to make class interesting." 

Seven students commented on the fact that the teacher "allows the 
students to interrupt too much." Two said that there was not enough 
class participation. The disagreement here seems to be a matter of 
degree. The implication in the first statement is that some participa- 
tion is desirable. Since this item has a bearing on making the class in- 
teresting, the teacher would do well to look into the matter and de- 
velop a solution to make it satisfactory to the majority of the students. 

The rating of teachers by students has been opposed on many 
grounds: that students are not competent, that irrelevant factors may 
distort the judgment of students, that the teacher is thus forced to 
cater to majority wishes, that pupil ratings tend to disrupt teacher 
morale. Remmers and Gage (11) have given effective and reasonable 
evidence in refutation of each of these charges. 


National Teacher Examinations (12) 


These examinations purport to measure the professional back- 
ground, mental ability, and general cultural knowledge of teachers. 
Their primary purpose is to aid in the selection and appointment 
of qualified and capable teachers. They are also intended to encour- 
age and guide the professional growth of teachers. Although many 
school systems require national teacher examination records as 
prerequisites to employment, it is granted that good teaching 
involves more than the abilities measured by these tests. It involves a 
variety of personal, interpersonal, and intellectual qualities which are 
not accessible to measurement by paper-and-pencil tests. Nevertheless, 
information about teachers’ mental ability and knowledge is impor- 
tant. Tests and examinations are an objective source for providing 
this data about teachers. They also make possible a comparison of 
teachers in terms of their scores. 

The National Teacher Examinations provide a comprehensive sur- 
vey of abilities and knowledge believed to be important in teaching. 
The following tests constitute the examination program: 


THE COMMON EXAMINATION BATTERY 


1. Professional Information 
     Education as a Social Institution 
     Child Development and Educational Psychology 
     Guidance and Measurement in Education 
     General Principles and Methods of Teaching 
2. General Culture 
     History, Literature, and Fine Arts 
     Science and Mathematics 
3. English Expression 
4. Nonverbal Reasoning 


There are also optional examinations which are to indicate the 
mastery of the subject matter to be taught. Some of these are: Edu- 
cation in the Elementary School, English Language and Literature, 
Social Studies, Biological Sciences, and Physical Sciences. 

The results of the examinations are used to compute nationwide 
norms for teachers in training and teachers in service. The perform- 
ance of the individual teacher may then be evaluated in terms of these 
norms. However, “acceptable” scores vary from school system to 
school system. Each school may set its own policy regarding the im- 
portance of the test results in relation to such other data as personality 
ratings, records of training and experience, and interview material. 

Five purposes of the National Teacher Examinations have been 
stated by the National Committee on Teacher Examinations of the 
American Council on Education (10): (a) The examinations provide 
measures of academic achievement to supplement college credentials, 
(b) The examinations provide a consistent basis for measuring some 
of the important qualifications of teaching applicants, (c) The ex- 
aminations provide a measure of breadth of background of general 
educational development, (d) The examinations provide data which 
may be used directly in the preparation of eligibility lists in the large 
school systems, and (e) The examinations provide assistance in re- 
allocation of teaching loads. 


Summary 


Evaluative criteria, rating scales and methods, inventories, ques- 
tionnaires, observational techniques, and tests of professional knowl- 
edge are used to evaluate school and teaching practices. In order to 
evaluate the total school program and practices, schedules and check- 
lists of evaluative criteria for both elementary and secondary schools 
have been formulated. The criteria are based upon best current prac- 
tices in curriculum, instruction, and organization. These best prac- 
tices are used as the norms by which local school practices may be ap- 
praised by self-evaluation committees and visiting committees, as in 
the case of the Evaluative Criteria developed by the Cooperative 
Study of Secondary School Standards. 

Evaluation of the climate of a classroom may be obtained by sys- 
tematic observations of the patterns of pupil-teacher rapport, or rela- 
tionships. Among the characteristic relationships observed and eval- 
uated are the degree and quality of social interaction among pupils, 
evidence of pupil interest and enjoyment, and the emotional tone of 
the pupil group. Evaluation of other teaching practices has been 
made by such inventories of attitudes and opinions as “How I Teach." 
The teacher responds to described situations involving insights into 
child and adolescent psychology. 

Evaluation of the teacher and teaching effectiveness has involved 
the use of rating methods and tests. For administrative purposes and 
records, teacher rating scales are sometimes used by principals or 
supervisors. The Torgerson Diagnostic Teacher Rating Scale is illus- 
trative of this rating method. For self-evaluation purposes, the Purdue 
Rating Scale for Instructors or the Bryan-Yntema Rating Scale may 
be used for the rating of teachers by pupils. The pupils are given an 
opportunity to rate the teacher on such characteristics as discipline, 
fairness, sympathy, or planning ability. For measurement of profes- 
sional information, mental ability, and general cultural knowledge, 
objective tests have been devised by the National Teacher Examina- 
tions. Since the role that the teacher plays in guiding the personal, 
social, and academic growth of pupils is so important, improvement of 
methods and techniques for evaluating personal characteristics and 
professional competence of teachers is highly desirable. 


Problems for Class Discussion 


1. Select a section of one of the scales of evaluative criteria and apply this 
section to a school situation. Prepare a report on your application of the 
criteria. 

2. Use the pupil-teacher rapport scale in several classrooms. What similar- 
ities and differences in classroom climates are observed? To what factors 
may they be attributed? 

3. Ask several of your associates to rate you on a teacher rating scale. 
Make a summary of the ratings and report on your strengths and weak- 
nesses as revealed by the raters. 


References Cited in This Chapter 


1. Baker, James F., Elementary Evaluative Criteria. Boston: School of 
Education, Boston University, 1953. 

2. Brandenberg, G. C., and Remmers, H. H., The Purdue Rating Scale for 
Instructors. Lafayette, Indiana: Lafayette Printing Co., 1928. 

3. Bryan, R. C., and Yntema, O., A Manual on the Evaluation of Student 
Reactions in Secondary Schools. Kalamazoo, Mich.: Western State 
Teachers College, 1939. 

4. Cooperative Study of Secondary School Standards. Evaluative Criteria. 
Washington, D. C.: American Council on Education, 1950. 

5. Kelley, Ida B., and Perkins, Keith J., How I Teach. Minneapolis: Educa- 
tional Test Bureau, 1941. 

6. Lippitt, Ronald, “An Experimental Study of Authoritarian and Demo- 
cratic Group Atmosphere,” University of Iowa Studies in Child Welfare, 
Vol. 16, No. 8, 1940. 

7. McCall, W. A., Herring, J. P., and Loftus, J. J., School Practices Ques- 
tionnaire. New York: Laidlaw Brothers, Inc., 1937. 

8. Morrison, J. C., and Ruegsegger, V., A Scale for Rating Elementary 
School Practice. Albany, New York: University of the State of New 
York, 1943. 

9. Mort, Paul R., Cornell, F. G., and Hinton, N. H., What Should Our 
Schools Do? New York: Bureau of Publications, Teachers College, 
Columbia Univ., 1938. 

10. National Committee on Teacher Examinations. National Teacher Ex- 
aminations. American Council on Education, August, 1945. 

11. Remmers, H. H., and Gage, N. L., Educational Measurement and Eval- 
uation. New York: Harper & Brothers, 1943. 

12. Ryans, D. G., “The Use of National Teacher Examinations in School 
Systems,” Teacher Selection Papers and Reports No. 13, American 
Council on Education, September, 1948. 

13. Shane, H. G., “A 1950 Census of Evaluation Practices,” Educational 
Leadership, 8:73-77, November, 1950. 

14. Tschechtelin, Sister M. Amatora, and Remmers, H. H., Diagnostic 
Teacher Rating Scale. Lafayette, Indiana: Division of Educational Ref- 
erence, Purdue University, 1940. 

15. University of the State of New York. Elementary School Inventory. 
Albany, New York: State Department of Education, 1941. 

16. Wrightstone, J. W., “Analyzing and Measuring Democracy in the 
Classroom,” Nation's Schools, 11:31-35, May, 1933. 

17. Wrightstone, J. W., “Measuring the Social Climate of a Classroom,” 
Journal of Educational Research, 44:341-351, January, 1951. 


References for Further Reading 


Barr, A. S., and others, “Second Report of the Committee on Criteria of 
Teacher Effectiveness,” Journal of Educational Research, 46:641-658, 
May, 1953. 

This report presents an analytical and constructive program for identi- 
fying and validating criteria of teacher effectiveness. It is one of the most 
recent and comprehensive statements available. 


Beecher, Dwight E., Evaluation of Teaching. Syracuse, N. Y.: Syracuse 
University Press, 1949. 

This provides an overview of the problems involved in the evaluation 
of teaching and reviews methods of appraisal that have been applied. 


Remmers, H. H., and Gage, N. L., Educational Measurement and Evalua- 
tion. New York: Harper & Brothers, 1943. 

Chapter 19 of this volume on “The Teacher” provides a comprehensive 
statement of methods and techniques used to evaluate the teacher. The 
major emphasis is upon methods of teacher self-evaluation. 


APPENDIX. Basic Statistical Concepts 


In order to understand or use test results wisely, a 
teacher or school officer must be familiar with selected statistical con- 
cepts. In addition to the interpretation of statistical test norms, which 
have been described in the chapters of this volume, other statistical 
concepts are essential. A study by Mathews ¹ showed the following 
most often encountered: construction and interpretation of charts, 
tables, and distributions; computation of measures of central tendency; 
quartile points, percentiles, and quartile deviations; interpretation of 
standard deviation and correlation coefficients. In a like manner, 
Dickey ² made an analysis of statistical concepts in professional journals 
and encountered the following most frequently: 


a. Measures of central tendency: mean, median, average 
b. Measures of variability: standard deviation, range, quartile 
deviation 
c. Correlation: Pearson r, rank order correlation 
In view of these systematic studies and the general experience of 

teachers in courses in educational measurement, the following statisti- 
cal concepts and procedures will be explained and illustrated in an 
elementary form: tabulation and classification, graphic representa- 
tions, measures of central tendency, measures of variability, and meas- 
ures of correlation. 


TABULATION AND CLASSIFICATION 
Suppose three teachers of fifth-grade classes have administered a 
reading test to 33, 33, and 32 pupils, respectively, in their classes. 
One of the first steps in the handling of test scores is tabulation and 
classification. In the class of 32, for example, pupils obtained the fol- 
lowing scores: 72, 68, 66, 63, 61, 58, 57, 55, 54, 53, 52, 51, 49, 48, 47, 
46, 46, 45, 43, 41, 40, 38, 37, 36, 34, 33, 30, 28, 24, 19, 13, 8. This 
teacher has already arranged the test scores in descending order from 
72 to 8. 


¹ Mathews, Chester O., “The Introductory Course in Educational Measurements,” 
Educational Administration and Supervision, 21:431-447, September, 1935. 
² Dickey, John W., “Statistical Ability Necessary to Read Education Journals,” 
Journal of Educational Psychology, 27:149-154, February, 1936. 


Frequency Distribution This series of scores may be condensed 
into even more concise form for the 32 pupils. This is done by set- 
ting up the scores in step intervals of 5. They may be classified and 
tabulated as shown in Table 1 below. Such a classification of scores 
is called a frequency distribution, because the frequency of occurrence 
of scores for each step interval is indicated in the form of a table. 
Some suggestions for making the frequency table are: (1) deter- 
mine the range, which is the difference between the highest and low- 
est scores; (2) select a class interval that will be convenient for tabu- 
lation; (3) write the limits of the class intervals in a left-hand column; 
and (4) tally the scores by making a short line for each score oppo- 
site the class interval into which it falls, and count these lines for 
the frequency of each class. 
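
The four suggestions for making a frequency table can be sketched in Python, using the 32 scores listed for the one class. This is a minimal illustration rather than a prescribed procedure; the function name and the choice to start every interval at a multiple of 5 (5-9, 10-14, and so on) are assumptions made for the sketch.

```python
from collections import Counter

SCORES = [72, 68, 66, 63, 61, 58, 57, 55, 54, 53, 52, 51, 49, 48, 47, 46,
          46, 45, 43, 41, 40, 38, 37, 36, 34, 33, 30, 28, 24, 19, 13, 8]


def frequency_distribution(scores, width=5):
    """Return (lower limit, upper limit, frequency) rows, highest first."""
    # Step 3: each score belongs to the interval whose lower limit is the
    # nearest multiple of `width` at or below the score.
    counts = Counter((score // width) * width for score in scores)
    top, bottom = max(counts), min(counts)
    # Step 4: report the count (the tally) for every interval in the range.
    return [(low, low + width - 1, counts.get(low, 0))
            for low in range(top, bottom - 1, -width)]


# Step 1: the range is the difference between highest and lowest scores.
score_range = max(SCORES) - min(SCORES)

# Step 2: a class interval of 5 is chosen, as in the text.
table = frequency_distribution(SCORES, width=5)

for low, high, freq in table:
    label = f"{low}-{high}"
    print(f"{label:<6} {'/' * freq:<7} {freq}")
print("Number", sum(freq for _, _, freq in table))
```

Run on these scores, the tallies agree with the frequencies worked out by hand for the class of 32.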


TABLE 1 
Frequency Distribution of Scores on a Reading Test 


Score Interval    Tally     Frequency 
70-74             /             1 
65-69             //            2 
60-64             //            2 
55-59             ///           3 
50-54             ////          4 
45-49             //////        6 
40-44             ///           3 
35-39             ///           3 
30-34             ///           3 
25-29             /             1 
20-24             /             1 
15-19             /             1 
10-14             /             1 
5-9               /             1 
Number                         32 


This table indicates a frequency of 1 score in the step interval 70-74, 
2 in the step interval 65-69, 2 in the step interval 60-64, 3 in the 
step interval 55-59, etc. It is easy to see the advantage of grouping 
scores into a frequency distribution and using step intervals, particu- 
larly if the number of cases which are to be handled is large. If each 
of the three teachers made such a frequency distribution and all 
were combined, they would give the frequencies shown in Table 2. 


GRAPHIC REPRESENTATIONS 


Histogram Sometimes teachers or supervisors wish to have the dis- 
tributions of scores represented graphically. For this purpose the fre- 
quency histogram may be used. It is simple and easily made and 
shows at a glance the distribution of scores achieved by pupils. If a 
frequency histogram were made of the data in Table 2, it would 
appear as a series of bar graphs placed one next to the other in a 
vertical array. The first bar representing the 75-79 interval would be 1 
unit high, the second bar representing the 70-74 interval would be 2 
units high, the third bar representing the 65-69 interval would be 5 
units high, and the fourth bar representing the 60-64 interval would 
be 8 units high, and so forth. 
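
A rough text version of such a histogram can be produced with a short Python sketch. Two departures from the description above are assumptions of the sketch: the bars are drawn sideways, one per line, because that is simpler in plain text, and the frequencies used are those of the single class of 32 pupils (Table 1), which are fully listed.

```python
# Frequencies for the class of 32 pupils, highest interval first.
TABLE_1 = [("70-74", 1), ("65-69", 2), ("60-64", 2), ("55-59", 3),
           ("50-54", 4), ("45-49", 6), ("40-44", 3), ("35-39", 3),
           ("30-34", 3), ("25-29", 1), ("20-24", 1), ("15-19", 1),
           ("10-14", 1), ("5-9", 1)]


def histogram_lines(table, symbol="#"):
    """One bar per interval; bar length equals the interval's frequency."""
    return [f"{label:>5} | {symbol * freq} ({freq})" for label, freq in table]


lines = histogram_lines(TABLE_1)
print("\n".join(lines))
```

The longest bar marks the interval with the most scores, which gives the same at-a-glance picture as the drawn histogram.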


FIGURE 1  A Frequency Histogram 
(vertical axis: frequency, 0 to 22; horizontal axis: score intervals, 5-9 to 75-79) 


Such concise summarization of test data into the 
frequency distribution, or its graphic equivalent, the frequency histo- 
gram, permits easy visualization and estimation of such statistics as 
median, mean, quartile, quartile deviation, and standard deviation. 

When data are summarized in a frequency table and statistics are 
computed, it is important to remember that the assumption is made 
that all the cases in a step interval are distributed evenly throughout 
the interval. When the number of cases is sufficiently large, this as- 
sumption is well founded. When the number of cases is small, the 
error that may follow from basing computations on this assumption is 
slightly larger than the error that occurs when many cases are involved. 


MEASURES OF CENTRAL TENDENCY (AVERAGES) 


Measures at the center or middle of a distribution are called meas- 
ures of central tendency. These measures are sometimes called aver- 
ages. In statistics average is used as a general term to include the 
mean, median, mode, and other measures of central tendency. In 
everyday language, "average" is used to denote the sum of a number 
of measures or quantities divided by their number. Statisticians, how- 
ever, use the term mean in place of arithmetic average. 


Median The most commonly used average in educational literature 
is the median. Its ease of computation may account for this. The 
median is that point on the scale which divides the total number of 
measures or cases into two equal groups. For example, if there are 
98 cases, the median is a point at or above which 49 of the cases lie, 
and at or below which the remaining 49 lie. It is the fiftieth percentile 
point of a distribution. The method of computing the median in a 
frequency distribution and the steps of the process are summarized 
in Table 2. At the same time, the first quartile (Q1) and the third 
quartile (Q3) are indicated for later reference. 


Note: On account of statistical theory, for computational purposes 
the intervals are considered as beginning .5 of a point below the 
designation. Thus, the interval 5-9 begins at 4.5, the interval 10-14 
at 9.5, and so forth. 

To summarize, the median, or fiftieth percentile, and any other per- 
centile points are located as follows: 


(1) Multiply the total number of frequencies by the given 
per cent or percentile, 

(2) count up the frequency column as far as possible with- 
out passing the required point, 

(3) determine the fractional distance into the next step in- 
terval to the required point, 

(4) multiply this fraction by the size of the step interval, and 

(5) add this result to the lower limit of the step interval in 
which the median or other percentile is located. 
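
The five steps can be written as one small Python function. This is a sketch, not a standard routine: the function name and argument names are hypothetical, and the caller is assumed to have already counted up the frequency column (step 2). The figures passed in below are those of the worked computation in Table 2: 98 cases, with 37 cases below the 45-49 interval (frequency 21), 24 below the 40-44 interval (frequency 13), and 72 below the 55-59 interval (frequency 10).

```python
def percentile_point(n_cases, fraction, cases_below, interval_freq,
                     lower_limit, width=5):
    """Locate a percentile point in a grouped frequency distribution.

    cases_below   -- cases counted before reaching the target interval
    interval_freq -- frequency of the interval containing the point
    lower_limit   -- lower limit of that interval (for 45-49, 44.5)
    """
    target = n_cases * fraction            # step 1
    needed = target - cases_below          # step 2 (remainder still needed)
    if not 0 <= needed <= interval_freq:
        raise ValueError("the point does not fall in the given interval")
    share = needed / interval_freq         # step 3
    correction = share * width             # step 4
    return lower_limit + correction        # step 5


median = percentile_point(98, 0.50, 37, 21, 44.5)
q1 = percentile_point(98, 0.25, 24, 13, 39.5)
q3 = percentile_point(98, 0.75, 72, 10, 54.5)
print(round(median, 2), round(q1, 2), round(q3, 2))
```

Rounded to two places, the three results agree with the median, Q1, and Q3 obtained in the worked computation.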


TABLE 2 
Computation of the Median and Quartiles 


SCORE INTERVALS (N = 98) 

75-79, 70-74, 65-69, 60-64, 55-59, 50-54, 45-49, 40-44, 
35-39, 30-34, 25-29, 20-24, 15-19, 10-14, 5-9 


STEPS IN THE COMPUTATION 

Median 

Step 1. Since half (50%) of the frequencies is required, 
1/2 of 98 = 49. 

Step 2. To locate the approximate median, count up the 
frequency column to the interval containing the 49th 
frequency, or the top of the 40-44 interval. This provides 
37 cases with 12 more required to make 49. 

Step 3. The step interval containing the median is 45-49, 
which contains a frequency of 21; thus, the median is 12/21 
of the distance up this interval of 5; hence, 12/21 × 5 = 
60/21 = 2.86, which is the correction to be added. 

Step 4. Add the correction 2.86 to the lower limit, 44.5, of 
the interval 45-49, and the median is 47.36. 

Q1 and Q3 

Follow the same steps as for median, except that Q1 = 1/4 
and Q3 = 3/4 of the frequencies. 

Q1 = 1/4 of 98 = 24.5; .5/13 × 5 = 2.5/13 = .19; 
39.5 + .19 = 39.69. 

Q3 = 3/4 of 98 = 73.5; 1.5/10 × 5 = 7.5/10 = .75; 
54.5 + .75 = 55.25. 


Mean The mean is a more precise mathematical statistic than the 
median. The mean is obtained by adding a series of scores together 
and dividing the sum by the number of cases involved. This provides 
the arithmetic average, or mean. The mean, in contrast with the 
median, involves a contribution of size from each of the scores in the 
distribution, whereas the median is influenced only slightly by any unusual 
scores at the ends of the frequency distribution. 


Steps in the Computation of the Mean and Standard Deviation 


Mean 

Step 1. Guess a mean (GM) near the center of the distribu- 
tion; here the midpoint of the 45-49 interval, GM = 47. 

Step 6. Add the correction to the guessed mean 47.00. 
47.00 + .05 = 47.05, or the true mean. 


Standard Deviation 

Step 1. Compute the (f) × (d)², as in Column 5 of the 
table. 

Step 2. Obtain the sum of (f) × (d)², which here is 635. 

Step 3. Divide 635 by 98, which yields 6.4794. 

Step 4. Subtract the correction squared, which is .0001. 

Step 5. Extract the square root of 6.4793, which yields 2.54. 

Step 6. Multiply 2.54 by the interval 5, and the standard 
deviation is 12.7. 


The steps for computing the mean by the "short method" are: (1) to 
guess a mean near the center of distribution, (2) to lay off the devia- 
tions of intervals from the guessed mean, (3) to multiply the fre- 
quency of each interval by its deviation, (4) to obtain the algebraic 
Sum of the frequencies times the deviations, (5) to compute the cor- 
rection for guessing and to multiply it by the interval, and (6) to add 


this result to or subtract it from the guessed mean. 
he mean obtained by this method will vary slightly from the 


an obtamed by actually adding all the individual scores and di- 
ng by the total frequencies. Asa rie, Ye ANS iN WV WS 


D have no practical significance in the interpretation of this 


of a distribution is that point on a scale at which 
e found. In a grouped distribution, or frequency 


his point is sometimes 


scores around this average 
range, the quartile deviation, and the"ste 
monly used to show the spread of scores. 


Range The range of a series of scores or other measures is the 
tance from the lowest to the highest measure. Thus, the range of à 
series of test scores of which the lowest is 15 and the highest 58 is 


from 15 to 58, or 43 points. 


Quartile Deviation (Q) The quartile deviation is ordinarily that 
measure of deviation, or variability, of a distribution of scores which 
is associated with the median. It is one half the difference between 


TABLE 3
Computation of the Mean and Standard Deviation

    1            2            3           4            5
  SCORE      FREQUENCY    DEVIATION   (f) × (d)    (f) × (d)²
INTERVALS       (f)          (d)
  75-79          1            6           6           36
  70-74          2            5          10           50
  65-69          5            4          20           80
  60-64          8            3          24           72
  55-59         10            2          20           40
  50-54         14            1          14           14
  45-49         21            0           0            0
  40-44         13           -1         -13           13
  35-39          9           -2         -18           36
  30-34          8           -3         -24           72
  25-29          3           -4         -12           48
  20-24          1           -5          -5           25
  15-19          1           -6          -6           36
  10-14          1           -7          -7           49
   5-9           1           -8          -8           64

  N = 98                                +94          635
                                        -93
  GM = 47                                +1


STEPS IN THE COMPUTATION

Mean

Step 1. Guess a mean (GM) near the center of the distribution; here the
        midpoint of the 45-49 interval is the GM = 47.

Step 2. Lay off the deviations (d) from the guessed mean, as in Column 3
        of the table.

Step 3. Multiply each (f) by its (d), as in Column 4 of the table.

Step 4. Obtain the algebraic sum of the (f) × (d) column, which here are
        +94 and -93 or +1.

Step 5. Compute the correction.
        +1 ÷ 98 = .01
        .01 × 5 = .05.

Step 6. Add the correction to the guessed mean 47.00.
        47.00 + .05 = 47.05 or the true mean.

Standard Deviation

Step 1. Compute the (f) × (d)², as in Column 5 of the table.

Step 2. Obtain the sum of (f) × (d)², which here is 635.

Step 3. Divide 635 by 98, which yields 6.4796.

Step 4. Subtract the correction squared, which is .0001.

Step 5. Extract the square root of 6.4795, which yields 2.545.

Step 6. Multiply 2.545 by the interval 5, and the standard deviation
        is 12.7.


median, involves a contribution of size from each of the scores in the 
series, but the median is influenced only slightly by any unusual 
scores at the ends of the frequency distribution. 

The steps for computing the mean by the "short method" are: (1) to 
guess a mean near the center of distribution, (2) to lay off the devia- 
tions of intervals from the guessed mean, (3) to multiply the fre- 
quency of each interval by its deviation, (4) to obtain the algebraic 
sum of the frequencies times the deviations, (5) to compute the cor- 
rection for guessing and to multiply it by the interval, and (6) to add 
this result to or subtract it from the guessed mean. 

The mean obtained by this method will vary slightly from the 
mean obtained by actually adding all the individual scores and di- 
viding by the total frequencies. As a rule, the difference is so slight 
as to have no practical significance in the interpretation of this 
statistic. 
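The six steps of the "short method" can be illustrated with a brief Python sketch. The function name `short_method_mean` and the use of interval midpoints as input are assumptions made for illustration; the frequencies are those of Table 3.

```python
# (interval midpoint, frequency) for each step interval of Table 3.
ROWS = [(7, 1), (12, 1), (17, 1), (22, 1), (27, 3), (32, 8), (37, 9),
        (42, 13), (47, 21), (52, 14), (57, 10), (62, 8), (67, 5),
        (72, 2), (77, 1)]


def short_method_mean(rows, guessed_mean, width):
    """Mean of grouped scores computed from a guessed mean."""
    n = sum(f for _, f in rows)
    # Steps 2-4: deviations in interval units, times frequency, summed.
    sum_fd = sum(f * (mid - guessed_mean) / width for mid, f in rows)
    # Step 5: correction for guessing, converted back to score units.
    correction = sum_fd / n * width
    # Step 6: add (or subtract, if negative) the correction.
    return guessed_mean + correction


mean = short_method_mean(ROWS, guessed_mean=47, width=5)
print(round(mean, 2))
```

Because each score is treated as if it lay at the midpoint of its interval, the result agrees exactly with the midpoint-weighted average but only approximately with the mean of the individual scores, which is the slight difference the text describes.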


Mode The mode of a distribution is that point on a scale at which 
most measures are found. In a grouped distribution, or frequency 
table, the true mode cannot be determined by inspection. Its calcu- 
lation under such conditions is difficult and complicated to explain. 
In a less precise sense, however, the mode is applied to any point on
a scale where the frequencies are largest. This point is sometimes 
called the major mode. In Tables 2 and 3, the major mode theoretically 
would be 47, the midpoint of the interval 45-49. 


MEASURES OF VARIABILITY 

Frequently it is not enough to know the average—median or mean— 
score of a distribution because two groups of pupils may have the 
same average, but the spread, scatter, variability, or deviation of the 
scores around this average may be different. Such measures as the 
range, the quartile deviation, and the standard deviation are commonly
used to show the spread of scores.

Range The range of a series of scores or other measures is the distance
from the lowest to the highest measure. Thus, the range of a series of
test scores of which the lowest is 15 and the highest 58 is from 15 to
58, or 43 points.


Quartile Deviation (Q) The quartile deviation is ordinarily that 
measure of deviation, or variability, of a distribution of scores which 
is associated with the median. It is one half the difference between
the points at the twenty-fifth and seventy-fifth percentiles in a
frequency distribution of measures. These are Q1 and Q3, respectively.

For making comparisons between groups of pupils who have taken
the same test, the quartile deviation is one of the most easily
computed measures of scatter, or variability. It is known also as the
semi-interquartile range. It is one half the distance from the first to
the third quartile. In formula form it may be indicated thus: Q equals
Q3 minus Q1, divided by 2. Using this formula for the data in Table 2,
we find that 55.25 minus 39.69 is 15.56, which divided by 2 is 7.78.
If we assume that the distribution of scores is symmetrical, then the
distance from the first quartile to the median would be exactly the
same as from the median to the third quartile. In the case of the table,
this would mean that if the distribution were exactly symmetrical,
Q1 would be 7.78 below the median 47.36 and Q3 would be 7.78
above the median. Unless a distribution is decidedly skewed in its
shape, the percentage of cases included within the distance of Q on
both sides of the median constitutes approximately 50 per cent of the
cases.
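The quartile deviation, and the symmetry check described above, can be sketched in Python as follows. The function name `quartile_deviation` is hypothetical; the quartile and median values are those given for Table 2.

```python
def quartile_deviation(q1, q3):
    """Semi-interquartile range: half the distance from Q1 to Q3."""
    return (q3 - q1) / 2


Q1, MEDIAN, Q3 = 39.69, 47.36, 55.25   # values computed in Table 2

q = quartile_deviation(Q1, Q3)

# Where the quartiles would fall if the distribution were
# exactly symmetrical about the median.
symmetric_q1 = MEDIAN - q
symmetric_q3 = MEDIAN + q

print(round(q, 2), round(symmetric_q1, 2), round(symmetric_q3, 2))
```

Comparing the symmetric positions with the quartiles actually obtained gives a quick indication of how far the distribution departs from symmetry.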


Standard Deviation The standard deviation is one of the most frequently
used measures of deviation, or variability. It is always based
upon the squares of the actual deviations of each score from the mean.
In a normal distribution, a distance of one standard deviation on either
side of the mean includes 34.13 per cent of the area of the curve or,
in other words, of the number of cases. Therefore, 68.26 per cent of
the cases in a normal distribution lie not more than one standard
deviation above and below the mean.

For the "short method" of calculation (Table 3), the steps for the
calculation of the standard deviation of a grouped series of scores are:
(1) multiply each frequency by its deviation squared, (2) find the
sum of these fd², (3) divide this sum by the N, or total frequencies,
(4) subtract the correction squared, (5) extract the square root of
this result, and (6) multiply by the interval size, which in this
instance is 5. The standard deviation thus obtained may be used for a
variety of purposes, one of the most common of which is to compare
the relative homogeneity or heterogeneity of groups of pupils who
have taken the same test.
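A minimal Python sketch of these six steps follows. The function name `short_method_sd` is an assumption for illustration; the midpoints and frequencies are those of Table 3.

```python
import math

# (interval midpoint, frequency) for each step interval of Table 3.
ROWS = [(7, 1), (12, 1), (17, 1), (22, 1), (27, 3), (32, 8), (37, 9),
        (42, 13), (47, 21), (52, 14), (57, 10), (62, 8), (67, 5),
        (72, 2), (77, 1)]


def short_method_sd(rows, guessed_mean, width):
    """Standard deviation of grouped scores from a guessed mean."""
    n = sum(f for _, f in rows)
    devs = [((mid - guessed_mean) / width, f) for mid, f in rows]
    sum_fd = sum(f * d for d, f in devs)        # used for the correction
    sum_fd2 = sum(f * d * d for d, f in devs)   # Steps 1-2
    correction = sum_fd / n
    variance_in_steps = sum_fd2 / n - correction ** 2   # Steps 3-4
    return math.sqrt(variance_in_steps) * width         # Steps 5-6


sd = short_method_sd(ROWS, guessed_mean=47, width=5)
print(round(sd, 1))
```

Carrying full precision through every step, rather than rounding at each one, changes only the later decimal places of the result.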


MEASURES OF CORRELATION 


The product-moment method of computing the coefficient of correlation
is most frequently used. For a small number of cases, usually not
more than 30, the data are generally arranged in two columns, the
corresponding entries in which constitute a pair of measures, for
example, scores on a reading test and a history test. For more than 30
cases, a correlation or double entry table is most convenient and
economical.

TABLE 4
Computation of the Product-Moment Correlation

     SCORES         DEVIATION
    x       y           x
   31      24          -4
   36      34          +1
   36      36          +1
   30      29          -5
   38      36          +3
   37      36          +2
   28      24          -7
   37      31          +2
   36      31          +1
   34      27          -1
   38      36          +3
   38      35          +3
   40      35          +5
   34      32          -1

   GMx = 35            +21
   GMy = 32            -18
                        +3

   cx = +3 ÷ 14 = .21
   cx² = .04
   SDx² = 155 ÷ 14 = 11.07 - .04 = 11.03     SDx = √11.03 = 3.32
   SDy² = 250 ÷ 14 = 17.86 - .02 = 17.84     SDy = √17.84 = 4.22

The formula for product-moment correlation is the sum of the cross 
products of deviations of the corresponding pairs of measures from 
their means divided by the product of the standard deviations of the 
two distributions. This yields the coefficient of correlation. 

A simple form of this method of calculating the coefficient of corre- 
lation is illustrated in Table 4. This form is used when the number 
of cases is small. The basic formula, however, is exactly the same as 
for a large number of cases. It is: the coefficient of correlation equals 
the sum of the products of the corresponding deviations divided by 
the number of cases; from this quotient the product of the corrections
of the guessed means is subtracted; this result is divided by the
product of the standard deviations of the two distributions. 

In this formula the standard deviations are computed in exactly the 
same manner as described in Table 3 on mean and standard deviation. 
The entire process of calculation is illustrated in Table 4. 


STEPS IN COMPUTING PRODUCT-MOMENT CORRELATION 


Step 1 For each series of scores x and y obtain the deviation of each
score from its mean. The true mean may be used, but since the true
mean is rarely a whole number, it is usually more convenient to work
from an assumed or guessed mean. The guessed mean of x is selected
as 35, and that of y as 32. Each deviation of an x score from the
assumed mean is represented by x, and each deviation of a y score
from its assumed mean is represented by y. The plus or minus signs
of the deviation must be indicated.

Step 2 Obtain the squares of each deviation. These columns are
headed x² and y².

Step 3 Obtain the product of each pair of deviations. That is, each
x is multiplied by its corresponding y. The first three products are as
follows: -4 × -8 = 32; +1 × +2 = 2; +1 × +4 = 4; etc. The other
products are obtained in a similar manner.

Step 4 Substituting in the formula, the cross product sum, 157, which
is the algebraic sum of the xy column, is divided by N, which is 14.
This yields 11.21. cx is the algebraic sum of the x column divided by
N, and cy is the algebraic sum of the y column divided by N. In this
illustration cx is +.21, and cy is -.14. Their product, cxcy, is -.03.
Since the formula requires the subtraction of the correction, we have
11.21 - (-.03) or 11.21 + .03 = 11.24. Remember that in algebra a minus
sign before a minus quantity changes the quantity to a plus. The SDx
and SDy are obtained by the usual formula. The product of SDxSDy
is 3.32 × 4.22 or 14.01. The final calculation is 11.24 ÷ 14.01 or .80,
which is the coefficient of correlation.
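The guessed-mean form of the calculation can be sketched in Python. The function names `r_from_sums` and `r_from_scores` are hypothetical; the column totals passed in the usage line (N = 14, sums of +3 and -2, sums of squares 155 and 250, cross product sum 157) are the ones quoted in the text and in Table 4.

```python
import math


def r_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Coefficient of correlation from column totals of deviations
    taken from guessed means."""
    cx = sum_x / n                      # correction for guessed mean of x
    cy = sum_y / n                      # correction for guessed mean of y
    sd_x = math.sqrt(sum_x2 / n - cx ** 2)
    sd_y = math.sqrt(sum_y2 / n - cy ** 2)
    return (sum_xy / n - cx * cy) / (sd_x * sd_y)


def r_from_scores(xs, ys, gm_x, gm_y):
    """Same calculation starting from paired scores and guessed means."""
    dx = [x - gm_x for x in xs]
    dy = [y - gm_y for y in ys]
    return r_from_sums(
        len(xs), sum(dx), sum(dy),
        sum(d * d for d in dx), sum(d * d for d in dy),
        sum(a * b for a, b in zip(dx, dy)),
    )


# Column totals as quoted in the text for Table 4.
r = r_from_sums(n=14, sum_x=3, sum_y=-2, sum_x2=155, sum_y2=250, sum_xy=157)
print(round(r, 2))
```

Because the corrections cx and cy remove the effect of the guess, any convenient whole numbers may serve as guessed means without changing the coefficient.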

The coefficients of correlation range from a maximum of +1.00 
down to zero and to —1.00. A perfect correlation means that each 
pupil's score in one series corresponds exactly with his score in the
other series on the basis of the relative size of the scores. If the agree- 
ment is perfect a value of +1.00 is obtained. If the disagreement is as 
large as possible a value of —1.00 is obtained. If there is no relation- 
ship at all, neither agreement nor disagreement, the value is .00. Inter- 
mediate values indicate various degrees of agreement or disagreement. 

Correlation has practical uses, mainly in discovering the relationships
between various factors or abilities in learning. Another use is in
test construction. Scores for the odd items of a test are correlated
with scores for the even items for each pupil in a group to determine
the coefficient of reliability of the test. These few indications of
the uses of correlation show only elementary applications to
educational and testing problems.
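The odd-even procedure can be illustrated with a short Python sketch. The item results below are hypothetical data invented for the example, and the names `pearson_r` and `odd_even_scores` are assumptions; the sketch only correlates the two half scores and applies no further adjustment.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation computed from the true means."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def odd_even_scores(item_results):
    """Score on odd-numbered items and on even-numbered items."""
    odd = sum(item_results[0::2])    # items 1, 3, 5, ...
    even = sum(item_results[1::2])   # items 2, 4, 6, ...
    return odd, even


# Hypothetical results: one row per pupil, 1 = right, 0 = wrong.
PUPILS = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 0, 1, 0],
    [0, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
]

halves = [odd_even_scores(p) for p in PUPILS]
odd_scores = [o for o, _ in halves]
even_scores = [e for _, e in halves]
reliability = pearson_r(odd_scores, even_scores)
print(round(reliability, 2))
```

With so few pupils and items the coefficient is unstable; the sketch is meant only to show the order of operations.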

Knowledge of statistical techniques will assist the teacher or
supervisor in making a better analysis and interpretation of test data.
The teacher, however, should be very cautious in making generalizations
regarding statistical data until a comprehensive knowledge and
understanding of statistical theory and practice has been achieved.


Glossary 


arithmetic mean: the sum of a set of scores divided by the number of scores 
(Syn.: average, mean). 


average: a measure of central tendency, such as the arithmetic mean, the 
median, and the mode. 


attitude test: an instrument for measuring the pattern of likes and dislikes 
characteristic of a person or group of persons. 


battery: a group of tests standardized on the same population, so that re- 
sults on the individual tests are comparable; any group of tests which 
are administered as part of a single testing program. 

bias: in sampling, the selection of cases in such a way that a sample is
not representative, thus giving rise to a systematic error.

blueprint (for a test): an outline for a test indicating both the subject
matter to be covered by the test and the types of behavior to be elicited
with respect to each subject-matter area.


case conference: a meeting of the individuals involved in a case study
(usually the school psychologist, the teacher, the guidance counselor,
the school nurse) to review the findings concerning a child and to plan
future treatment.

case study: a diagnostic study of an individual, embodying a careful
examination of the physical and psychological factors that are
significant in the life of the person, undertaken in order to reveal the
causes of educational or behavior difficulties.


centile: see percentile. 


checklist: a selected list of items, representing forms of behavior, a
sequence of skills, or a group of ideas, following which an observer
records a check to denote the presence or absence of whatever is being
observed.


coefficient of correlation: ... The index ranges in value from .00,
denoting an absence of relationship, to either +1.00 or -1.00, denoting
perfect positive or negative correlation, respectively.

coefficient of equivalence: an estimate of test reliability obtained by
administering two parallel forms of a test.

coefficient of internal consistency: an estimate of test reliability
obtained from the single administration of a test or instrument to a
representative group of individuals.

coefficient of stability: an estimate of test reliability obtained by administer- 
ing a given test to the same individuals after an intervening period of 
time. 


completion item: a test question in whi 
demonstrate comprehension by fillin 




complex process: a mental activity involving a relatively high degree of 
organization and control, such as reasoning and imagination. 


correction for guessing: a correction applied to scores obtained on
multiple choice or true-false tests, on the assumption that a person
will give correct answers, by chance, to a certain proportion of the
items he does not know. The formula is:
Corrected Score = R - W ÷ (N - 1), where R is the number of items
right, W is the number of items wrong, and N the number of alternative
choices.
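As a small worked illustration, the correction can be written as a Python function. The name `corrected_score` and the sample figures are hypothetical.

```python
def corrected_score(right, wrong, choices):
    """Score corrected for guessing: R - W / (N - 1)."""
    if choices < 2:
        raise ValueError("an item needs at least two alternative choices")
    return right - wrong / (choices - 1)


# Hypothetical pupil: 40 right and 12 wrong on four-choice items.
print(corrected_score(right=40, wrong=12, choices=4))

# On a true-false test (two choices) each wrong answer costs one point.
print(corrected_score(right=30, wrong=10, choices=2))
```

Omitted items are counted neither as right nor as wrong, so they leave the corrected score unchanged.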

cumulative frequency distribution: a table of the frequencies of a series of 
scores in which each step shows the sum of all frequencies up to and 
including its own. 

cumulative record: a summary of a pupil’s educational history, providing a 
fairly complete record of his achievement, attendance, health, extra- 
curricular activities, etc. 


decile: a name given to every tenth percentile. The first decile is the 10th 
percentile, the second decile is the 20th percentile, etc. 

derived score: a score obtained from a raw score by some statistical tech- 
nique, e.g., a standard score, a grade score, or IQ. 

descriptive rating scale: a rating scale in which the possession of a given 
degree of some trait is represented by checking one of several descrip- 
tive statements, such as: always prompt; few latenesses, always justifi- 
able; many inexcusable latenesses. 

deviation: the amount by which a score differs from a measure of central 
tendency, as mean or median or some other reference value. 

diagnostic test: a test used to determine specific weaknesses and deficiencies 
in a given area. 

difficulty of a test item: a characteristic of a test item, usually measured in 
terms of the proportion of a given group who answer the item correctly. 

directed interview: an interview in which the interviewer, usually a guid- 
ance counselor, asks direct questions and suggests courses of action to 
the interviewee. 

discriminative power of a test item: the ability of a test item to differentiate 
between persons having relatively greater and relatively lesser amounts 
of some trait. 

distractor: an incorrect choice offered in a multiple choice or matching test 
(Syn.: foil). 

educational age (EA): a pupil's average performance, in terms of age
scores, on a number of achievement tests in different school subjects.

educational quotient (EQ): the ratio of educational age to chronological
age; EA ÷ CA; an index of the pupil's achievement relative to that of
pupils of his own age.


equivalent forms: two or more forms of a test that are so closely
parallel in terms of functions measured and item difficulty that they
yield similar average scores and the same dispersion (Syn.: alternate
forms, parallel forms, comparable forms).


error analysis: a count of the frequency of specific types of errors,
such as computational errors, made by pupils on a given test (Syn.:
error count).

evaluation: the measurement and appraisal of a comprehensive range of
objectives, defined in terms of pupil behavior, through the use of a
variety of techniques.


evaluative criteria: the factors and standards considered by an evaluative 
agency in assessing the total school program of an educational institu- 
tion. 

essay test: a test calling for a relatively free written response to a
problem situation, in that the pupil may be asked to discuss, compare,
describe, etc.

extrapolation: the process of estimating the values of a function outside
the range of available data. Ordinarily applied in extending norms for a
test beyond the limits of the standardization group, in order to permit
interpretation of extreme scores.

factor analysis: a method of analyzing the interrelationships among a
number of variables to describe test performance with the least number
of relatively independent factors or to describe the nature of the basic
processes that influence test performance.

first quartile (Q1): the 25th percentile; the point on a scale below
which 25 per cent of the cases fall.

forced choice: a rating technique in which the rater is forced to choose
between paired alternatives which are equal in "preference" value, i.e.,
are equally acceptable to persons who are used as raters, but which
discriminate between individuals who score high and low on the trait
which the alternatives measure.

fore-exercise: a trial exercise or series of items designed to acquaint a
person taking a test with the nature of the items to be administered and
the procedure to be followed in the actual test.


frequency: the number of cases falling at a given score, or within any class 
of scores. 


frequency distribution: a tabulation of the frequencies of scores or
groups of scores, arranged in order of magnitude.

grade equivalent: the grade score assigned to a given raw score on a
test, representing the average grade level of pupils obtaining that
score.

grade norms: values representing typical or average performance for
different grade groups.


graphic rating scale: a rating scale on which the possession of a given degree 
of some trait is represented by a check on a straight line. 


group dynamics: the complex pattern of interrelationships within a group 
producing or governing the actions of or movements within the group. 


group test: a test that may be administered to a number of persons at the 
same time by a single examiner. 


halo effect: in rating individuals, the tendency to allow one's general esti- 
mate of an individual to influence the rating of independent traits; in 
rating responses to essay questions, the tendency to allow a rating of a 
preceding response to influence the rating of a following response. 


individual differences: the observed differences between individuals in a 
given characteristic. 

individual test: a test which can be administered to only one person at a 
time. 

informal test: generally, a teacher-made test constructed for use in a given 
classroom or school. 

intelligence quotient (IQ): originally, the ratio of mental age to
chronological age (MA ÷ CA). More frequently, in present-day test
construction,
the term refers to a deviation IQ, found by determining the difference 
between an individual's obtained score and that which is normal for 
the individual's age. 

inventory: a test or a checklist of an individual's abilities, personal char- 
acteristics, or interests. 

inventory test: a test which attempts to measure rather completely a rela- 
tively limited range of an individual's knowledge or skill; often used as 
a preliminary measure prior to instruction in a subject. 


item: a single exercise or question in a test. 


item analysis: the process of determining the validity of a test item, through 
a consideration of the difficulty and discriminating power of the item. 


log: a record of activity in a given area, such as after-school play, visits to 
library, etc. 


man-to-man rating scale: an approach to rating in which an individual is 
rated by comparing him with other persons known to the rater. 


matching exercise: a test exercise which calls for the correct association of 
each item in one list with an item in a second list. 
mean: see arithmetic mean. 


median: that point or score value which divides the cases in a frequency 
distribution into two equal parts; the 50th percentile. 


mental age (MA): the age for which a given raw score on an intelligence 
test is average or normal. 


modal age: the age or age range characteristic of pupils of specified
grade placement.

modal-age norms: norms based on the performance of pupils who are of
modal age for their grade.

mode: the score or value that occurs most frequently in a distribution
of scores.


multiple choice item: a test item which calls for the selection of the one
correct or best answer from several possible answers or options. 


N: the symbol commonly used to designate the number of cases in a given
sample or distribution.

non-directive interview: a technique in which the interviewer attempts to
help the interviewee gain self-insights while avoiding direct suggestions
and advice.

normal curve: a curve representing the distribution of data, such as
measures of abilities, based on large numbers of cases randomly selected
from a theoretically infinite population.

norms: the average test performance of given groups, such as age groups
or grade groups, obtained in the process of standardizing a test.

numerical rating scale: a rating scale in which the rater is instructed
to assign a number, generally from zero to 10, to represent the degree to
which an individual is judged to possess the trait in question.


objective: a desired change in the behavior of a pupil as the outcome of 
experiences organized and directed by school personnel. 

objective test: a test in which the subjective opinion or judgment of the
scorer does not enter.


observation: the process of observing the activity of an individual or a
group for a definite period of time and recording the occurrence of
specified behavior during that period.

omnibus test: a test in which items measuring a variety of functions are
presented in a single sequence rather than in separate subtests, and
give rise to a single score.

oral trade test: a test which gauges an individual's knowledge of a given
trade or occupation; designed to be given orally.

order of merit scale: see rank-order rating scale.

paired comparison rating: a rating technique in which each individual in
turn is judged as better or worse than every other individual in the
group.

percentile: that point in a frequency distribution below which fall the
per cent of the cases indicated by the particular percentile. Thus, 57
per cent of the cases fall below the 57th percentile.


percentile norms: the percentile score assigned to a given raw score on a 
test; allows one to determine at what percentile a pupil of a given age 
or grade falls. 


performance test: generally, a type of test in which emphasis is placed on 
motor responses, rather than responses involving paper and pencil. 
Also, a type of test in which the use of language is reduced to a mini- 
mum. 


personality test: a test designed to measure one or more of the non-intellec- 
tive characteristics of an individual’s psychological organization. 


personal report: an appraisal device in which the individual to be evaluated 
expresses his own opinion concerning his behavior, feelings, or traits. 


power test: a test which is designed to determine an individual’s level of 
performance rather than his speed of response; time limits are not set 
or are very generous. 


pretest: a test given to an individual or a group in order to determine status 
in some area prior to instruction in that area. 


practice effect: the influence of previous experience in taking a test upon 
later administration of the same or a similar test. 


probable error: a measure of dispersion; a range of one probable error on 
either side of the mean of a normal distribution includes exactly 50 per 
cent of the cases. 


problem checklist: a self-reporting device which enables an individual to 
indicate what specific problems in given areas he considers troublesome. 


product-moment coefficient: see coefficient of correlation. 


product scale: a rating device in which a series of models arranged in
order of merit are used to assign a quantitative score to the quality of
the product, generally an English composition or a specimen of
handwriting, being rated (Syn.: quality scale).


profile: a graphic means of representing the results on a number of tests 
administered to an individual or group. 


projective technique: a method of studying the personality of an individual 
by having him respond to a series of relatively unstructured stimuli, 
such as ink blots, etc. 


prognostic test: a test designed to predict future performance in a given 
field. 


quality scale: see product scale. 


quartile: one of the three points which divide a distribution into four equal 
parts. 


quartile deviation: one-half of the range of the middle 50 per cent of a
group of scores (Syn.: semi-interquartile range).

questionnaire: an instrument designed to obtain an individual's responses
to a number of questions centering about a given problem or area.

random sample: a limited number of cases chosen from a population in such
a way that every case has an equal chance of being selected; a sample
chosen in a purely chance manner from a population.

range: the difference between the highest and lowest scores obtained on a
test by the members of a group.


rank: the position assigned to a score in a series which is arranged in order 
of relative size. 


rank-difference correlation coefficient (ρ): a measure of relationship
based upon differences in the ranks of paired values of the variables
being correlated.


rank-order rating scale: a rating technique in which persons being rated are 
placed in serial or rank order in accordance with the rater's judgment 
of the degree to which they possess the quality or trait under consider- 
ation. 


rate test: a test in which the items are approximately equal in
difficulty, but in which the time limit set does not permit any
individual to complete all the items.

rating scale: a device for obtaining an individual's evaluation of a
given characteristic in terms of some objective scale of values.

raw score: the first numerical score obtained when a test is scored. It
may be expressed in terms of number of items answered correctly, time
taken to finish a test, etc.

given piece of reading material, in
uired for reasonable comprehension.


recall item: a test item that calls upon the individual to supply the
correct response from his own memory, as in a completion item.

recognition item: a test item that calls upon the individual to recognize
and select a correct response from two or more alternatives, as in a
true-false test.

reliability: the degree of consistency with which a test measures
whatever it does measure.

reliability coefficient: the coefficient of correlation between two forms
of a test, between scores on repeated administration of the same test, or
between halves of a test.


representative sample: a sample that matches the characteristics of the
population from which it is drawn in all respects that are important for
the purposes under consideration.


sample: a number of cases drawn from all of the cases comprising a given 
population. 


scaled test: a test in which the items are arranged in order of increasing 
difficulty. 


scholastic aptitude: see academic aptitude. 


sentence-completion: a projective technique in which the individual is re- 
quired to expand a short phrase into a complete sentence. 


situational test: a technique for evaluating an individual's adjustment by 
observing his behavior in a variety of situations. 


skewness: the extent to which a frequency distribution departs from sym- 
metry around the mean. 


sociodrama: a projective technique in which an individual acts out a role 
in a situation. 


sociogram: a pictorial representation of the patterns of choice and rejection 
among the members of a group. 


sociometry: measurement of the interpersonal relationships existing among 
the members of a group. 


speed test: a test on which an individual may work only for a given length 
of time. 

split-half reliability: the correlation between scores on one-half of a test 
with scores on the other half. Generally, the two halves of the test 
consist of the odd- and even-numbered items. 


standard deviation: a measure of the variability or spread of a set of scores, 
indicating a range of scores above and below the mean score including 
approximately two-thirds of the cases in the distribution. 


standard error: an estimate of the difference between an obtained score and 
a hypothetical true score; the "error of measurement." 

standard score: a means of expressing an individual's score in terms of a
deviation from the mean score of the group in relation to the stand-
ard deviation of the distribution of scores, thus:
standard score = (raw score - mean) / standard deviation
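The computation in this entry can be sketched in Python as follows. The function name and the figures are hypothetical, chosen only to show the arithmetic.

```python
def standard_score(raw_score, group_mean, group_sd):
    """Deviation of a raw score from the group mean, in standard deviation units."""
    return (raw_score - group_mean) / group_sd

# Hypothetical pupil: raw score 65 in a group with mean 50 and standard deviation 10.
z = standard_score(65, 50, 10)
```

A result of 1.5 means the pupil stands one and one-half standard deviations above the group mean; a score below the mean yields a negative value.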

standardized test: a sample of the performance of an individual or group ob- 
tained under prescribed conditions, scored according to prescribed 
rules, and interpreted by reference to normative data. 


subtest: one of a number of sections into which a test is divided; generally 
designed to measure a given aspect of the subject or area being tested. 


survey test: a test that measures general achievement in a given subject or
area. 


test-retest reliability: the correlation between scores obtained by administer- 
ing a given test for a second time after a short interval. 


Directory of Publishers of Tests


Science Research Associates, 57 West Grand Ave., Chicago 10, Ill.

Stanford University Press, Stanford, California

C. H. Stoelting Co., 424 North Homan Ave., Chicago 24, Ill.

World Book Co., 313 Park Hill Ave., Yonkers 5, N. Y.


Indexes 


Index of Authors 


Abrahamson, S., 421, 422, 423 

Alschuler, R. H., 354 

American Association of School Ad- 
ministrators, 411 

American Psychological Association, 
44 

Anderson, G. L., 190, 192, 197, 354 

Anderson, H. E., 115 

Anderson, H. H., 190, 192, 197, 354 

Armacost, G. H., 112, 114 

Ayres, L., 5, 6, 14, 261 


Baker, J. F., 428, 445 

Baldwin, B. T., 398, 410

Barr, A. S., 224, 446 

Beecher, D. E., 446 

Bell, H. M., 9, 14 

Bell, J. E., 187, 191, 194, 197, 198, 
354 


Benson, A. L., 323 

Betts, E. A., 248 
Beverley, F., 280, 291 
Biber, B., 135 

Bingham, W. V., 155 
Bogardus, E. S., 374, 376 
Bovard, J. H., 399, 410 
Brandenberg, G. C., 445 
Briggs, T. H., 112, 114
Brown, C., 287, 291
Brown, S. A., 291 
Brueckner, L. J., 268 
Brush Foundation, 398 
Bryan, R. C., 440, 441, 445 
Buckingham, B. R., 6 


Buros, O. K., 197, 243, 256, 257, 261,
267, 268, 273, 275, 280, 282, 285,
287, 290, 291, 305, 306, 323, 336,
376


California School Supervisors’ Asso- 
ciation, 238 

Campbell, D. S., 278, 292 

Carter, H. D., 306 

Chadderdon, H., 287, 291 

Chall, J. S., 252, 268 

Chamberlin, D., 10, 14 

Cheydleur, F. F., 282, 291 

Conrad, C. C., 259 

Cook, L. A., 214 

Cook, S. W., 376 

Coombs, A. W., 354 

Cooperative Study of General Education, 402, 404, 405

Cooperative Study of Secondary 
School Standards, 158, 171, 429, 
444, 445 

Cornell, F. G., 445 

Courtis, S. A., 5, 241 

Covner, B. J., 151, 154 

Cronbach, L. J., 42, 59, 197 

Crow, A., 216, 223 

Crow, L. D., 216, 223 

Cunningham, R., 171 

Curtis, F. D., 80, 99 


Dale, E., 252, 268 
Darley, J. G., 149, 150, 154 
Davis, A., 412, 423 




Davis, B., 11, 14
Tout E., 384, 392 
Dearborn, T. H., 406
Dearborn, W. F., 75 
Denver Public Schools, 238 
Department of Instruction, Denver
Public Schools, 404
Deutsch, M., 376
Dewey, J., 293, 294, 306, 377, 392
Dickey, J. W., 447 
Division of Research and Guidance,
Los Angeles County, 185, 197,
224, 238
Dollard, J., 423 
Douglass, H. R., 110, 114 
Duncan, O. D., 423


Ebel, R. L., 88, 99 

Eells, K. W., 15, 412, 414, 415, 416,
418, 419, 423

Ellingson, M., 185 

Elliot, E. C., 108, 115 

Eurich, A. G., 7, 98 

Evaluation of School Broadcasts, 
387, 388 


Federal Security Agency, United 
States Employment Service, 114 

Fels Institute, 398 

Flesch, R., 252, 268 

Flint, n 493 

Frank, L. K., 178, 197 

Freeman, F. N., 259

Friedman, B. B., 376

Froelich, C. P., 323

Fryer, D., 293, 306 


Gage, N. L., 88, 99, 197, 428, 442,
445


Gardner, B. B., 446 
Gardner, M. R., 446 

Gates, A. I., 185, 248 
Gerberich, J. R., 292 
Goddard, H. H., 6 

Good, C. V., 224, 336, 376 
Goodenough, F. L., 118, 185 
Grapko, M. F., 218

Greene, H. A., 292 




Grossnickle, F. C., 268 
Guilford, J. P., 167, 171, 310, 336 
Guttman, L., 49, 59 


Hamalainen, A. E., 185 
Hardaway, M., 289, 291 
Hartshorne, H., 354 
Hattwick, L. A., 354 
Havighurst, R. J., 412, 423 
Heil, L. M., 88, 92 
Herring, J. P., 427, 445 
Hildreth, G. H., 73, 75 
Hillegas, M. B., 6 

Hinton, N. H., 455 
Hollingshead, A. De B., 12, 14, 423
Holzinger, K. J., 99 
Howard, C., 24, 25, 28 
Hull, C. L., 311, 336 

Hunt, J. McV., 178, 197 


Isaacs, S., 383, 392 


Jahoda, M., 376 

Jarvie, L. L., 135 

Jennings, H. H., 213, 348

Jensen, M. H., 

Jersild, A. T., 118, 121, 185, 293,
306


Johns, N. B., 399, 400
Jorgenson, A. N., 292
Jourard, S. M., 218
Justman, J., 218 


Karnes, M., 285, 291 
Kearney, N. C., 15 
Kelley, I. B., 436, 445 
Kelley, T. L., 59, 310, 336
Kelty, M. G., 20, 28 

Kerr, W. A., 420

Knox, I. B., 99, 114 

Koos, L. V., 155 
Krugman, J. L., 135
Krugman, M., 183, 197
Kuder, G. F., 39, 48, 49, 59


Landreth, C., 382, 392 
Lee, J. M., 79, 99 
Lehman, H. C., 112, 114 




Lentz, T. F., 178 

Lindquist, E. F., 75, 100, 115, 242, 
268 

Lippitt, R., 434, 445 

Loeb, M. B., 423 

Loftus, J. J., 445 

Lorge, I., 252, 253, 268 

Los Angeles, Division of Research 
and Guidance, 135 

Lundin, R. W., 280, 291 

Lunt, P. S., 414, 423 

Lynd, H., 423

Lynd, R., 423 


Madsen, I. N., 75 

Maier, T. B., 289, 291 

Mathews, C. O., 447 

May, M. A., 354

McCall, W. A., 427, 445 

McCloy, C. H., 399, 410 

McNemar, Q., 376 

Meeker, M., 15, 414, 415, 416, 418, 
419, 423 

Meigs, M. F., 118, 121, 185 

Meyer, G., 105, 110, 114 

Michaelis, U., 24, 25, 28 

Micheels, W., 285, 291 

Michigan Department of Public 
Instruction, 407 

Monroe, W. S., 8, 241, 411 

Mooney, R. L., 197 

Moore, B. V., 155 

Moore, N. E., 20, 28 

Moreno, J. L., 200, 213, 348 

Morrison, J. C., 425, 445 

Mort, P. R., 445 

Murphy, G., 173, 191, 193, 195, 197 

Murray, H. A., 37, 186, 189, 197 

Myers, C. R., 353, 354 


National Committee on Teacher 
Examinations, 445 

National Society for the Study of 
Education, 268, 291 

Netzer, R. F., 261, 268 

Neugarten, B. L., 418, 423

Northway, M. L., 213 

Nyswander, D., 394, 410 




Oberteuffer, D., 394 

Olson, W. C., 224, 339, 354 
Orleans, J. S., 75 

Otis, A. S., 6 

Overstreet, H. A., 392 


Pace, C. R., 28 

Perkins, K. J., 436, 445 

Personnel Research Section, Adjutant 
General's Office, 165, 171 

Pfautz, H. W., 423

Piotrowski, Z. A., 197 

Plowman, L., 80, 99 

Price, M. A., 197 

Progressive Education Association,
388, 390

Pryor, H. B., 398, 410 


Raths, L. E., 8, 14, 18, 28, 375, 392,
421, 422, 423

Raye, H. C., 165 

Read, K. H., 287, 291, 382, 392 

Remmers, H. H., 20, 88, 99, 197, 
423, 437, 438, 442, 445, 446 

Reynolds, E. L., 410 

Rice, J. M., 4, 5, 12 

Richardson, M. W., 48, 49, 59 

Rinsland, H. D., 63, 75 

Rogers, C. R., 149, 150, 154, 347 

Rohde, A. A., 354 

Rosenzweig, S., 197, 354 

Ross, C. C., 88, 99 

Ruch, G. M., 231, 238, 393 

Ruegsegger, V., 425, 445 

Rugen, M. E., 394, 410 

Rugg, H. O., 170, 171 

Russell Sage Foundation, 261 

Ryans, D. G., 445 


Scates, D. E., 4, 14, 224 

Schenck, E. A., 282, 291 

Segal, D., 79, 99, 231, 238, 323 
Shane, H. G., 429, 445 

Shartle, C. L., 310, 336 

Shen, E., 169, 171 

Simmons, K., 410 

Sims, V. M., 99, 102, 109, 110, 114 
Smith, C. W., 75 




; . 8, 14, 17, 28 
aa E. M 18, 100, 115, 376, 392 
Sontag, L. W., 410 
Spearman, C. S., 49
Staff of Personnel Research Section, 
Adjutant General’s Office, 171 
Stalnaker, J. N., 115, 258, 268 
Stalnaker, R. C., 115 
Starch, D., 6, 108, 115 
Stone, C. W., 5 
Strang, R., 181, 197, 198, 223, 224 
Strong, E. K., Jr., 306 
Stroud, J. B., 80, 99 
Stump, N. F., 99, 112, 115 
Super, D. E., 307, 315, 336 
Symonds, P. M., 120, 135, 171, 190,
197, 219, 224


Tallmadge, M., 110, 114 

Tasch, R. J., 293, 306 

Terman, L. M., 6, 13 

Thorndike, E. L., 4, 5, 6, 12, 241 

Thorndike, R. L., 337 

Thouless, R. H., 392 

Thurstone, L. L., 11, 167, 171, 310,
336

Tomkins, S. S., 187, 188, 197 

Toops, H. A., 154

Torgerson, T. L., 155, 171, 438 

Trabue, M. R., 6 




Traxler, A. E., 115, 123, 129, 135, 
154, 174, 178, 197, 216, 217, 224, 
238, 354 

Tschechtelin, M. A., 439, 446

Tyler, R. W., 7, 14, 17, 28, 100, 115, 
171, 376, 392 


United States Department of Agri- 
culture, 407 

United States Employment Service, 
112, 114 


University of the State of New York, 
446 


Warner, W. L., 12, 14, 414, 415,
416, 418, 419, 423

Warren, H. C., 309, 336

Wechsler, D., 52

Wesley, E. B., 20, 98, 273, 292 

Wetzel, N. C., 398, 410 

Wood, T. D., 398, 410 

Woodcock, L. P., 382, 392

Woods, G. G., 80, 99 

Woody, C., 6 

Wrenn, C. G., 150, 154 

Wrightstone, J. W., 7, 14, 17, 18, 20,
28, 185, 213, 278, 285, 292, 384,
392, 433, 446


Yntema, O., 440, 441, 445 


Index of Subjects 


Achievement tests, 242-245
batteries, 243—245 
standardized, 242-243 

Analysis of errors, 72 

Anecdotal records, 33-34, 116, 123-138, 343-344, 364-366
defined, 123
interpretation, 132-138
methods of recording, 128-132 
objectives evaluated by, 126-127 

Appreciation, literary, 256-257 

Aptitudes, 308-337 
art, 330-331 
clerical, 325 
defined, 308-309 
intelligence, 316-323 
mechanical, 326-330 
musical, 330-331 
occupational, 332 
subject-matter, 324 
test batteries, 332-334
types of tests, 316-334 

Art, 276-280 

Art aptitude tests, 330-331 

Arts, industrial, 282-285 

Attitudes, 355-376 
checklist, 367 
defined, 356-357 
evaluation techniques, 358-375 
free response, 374-375 
importance of, 355-356 
inventory, 370-372 
observation, 370 
paired comparisons, 373 


paper and pencil techniques, 370 
presentation of stimuli, 369 
social distance scale, 374 
Audiometer, 397 
Autobiography, 180-181, 347 


Basic skills, 241 
Blueprint (for a test), 81 
Business education, 287-288 


Case studies, 37-38, 215-224 
Central tendency, 450—453 
Checklists, 35-36, 156-163, 
167-170, 367, 368 

construction, 167-170 

types, 158-163 

uses, 156-158 
Classroom climate, 433-435
Cleavages, 205-206, 210-212 
Clerical aptitude tests, 325 
Clique, 210-211, 412 
Completion test, 82-83 
Complex processes, 82, 88-94 
Correlation, 454—457 
Cumulative record, 38, 225-238


Descriptive rating scales, 36, 
163-164 
Distractor, 86-87 


Equivalence, coefficient of, 49-50 

Errors, analysis of, 72-73 

Essay test, 34, 101-110 
criticisms, 103-104 


kg ah es 104-110 
sampling, 104 
scoring, 103 
Evaluation, 3-28 
characteristics of good program, 
21-24 
definition of, 16 
design of program, 21-22 
development of modern, 4—7 
distinguished from measurement, 3-4


follow-up studies, 9-10 
functions of, 4, 16 
interrelation with curriculum, 24 
origins and trends, 3-12 
purposes of, 16 
recent trends, 10-12 
scope of, 7-10
steps in program, 17-21
use in school systems, 24-26 
Evaluation program, 60—74 
adaptation to local needs, 61-62 
introduction of, 61-62 
organizing personnel for, 62 
role of teacher and supervisor, 
60-61 
use of tests in, 62-72 
Evaluation techniques, 29-56 
classification of, 29-38 
practicability of, 54-56 
qualities for judging, 42-56 
use in administration, 38-39 
use in guidance, 40-41
use in instruction, 39-40 
use in research, 41 
Evaluation techniques (health), 
396-409 
attitudes, 404—406 
conditions, 406—407 
information, 400-403 
interests, 403—404 
observation, 396-398 
outcomes, 408—409 
practices, 399—400 
self-evaluation, 400 
Evaluative criteria, 424, 428-433


Factor analysis, 11-12 




Forced-choice rating scale, 165 
Foreign language, 280-282 

Free association, 345-347 
Frequency distribution, 448-449 


Gestalt psychology, 10, 339, 342 
Glossary, 459-468 

Graphic rating scale, 36, 165-166 
Graphic representation, 449—450 
Group dynamics, 12 

Grouping pupils, 204 

Guessing, 81 


Halo effect, 170 
Handwriting, 259-261
Health and physical development,
393-411
evaluation techniques, 396-408 
legislation, 393 
standards, 393-394 
Health instruction, 394—396 
objectives, 394—396 
Histogram, 449-450 
Home economics, 286-287 


Industrial arts, 282-285
Informal tests, 11, 242-243
Intelligence tests, 316-323 
Internal consistency, coefficient of, 
48-49 
Interests, evaluation of, 293-296 
categories, 301-302 
defined, 293 
importance in education, 293-296 
role of teacher, 296-297 
standardized devices, 302-305 
teacher-made devices, 298-299 


Interview, 35, 136, 148-152, 347-348, 366-367


directed, 149-150 
nature of, 148-149 
non-directive, 149-150 
recording, 151-152
techniques, 150-152
types, 149-150 
Inventory, 35, 136—148 


Language arts, evaluation of, 241, 
246-269 



reading, 246-257 

speaking, 260-262 

writing, 257—260 
Languages, foreign, 280-282
Literary appreciation, 256-257 
Log, 367 


Man-to-man rating scale, 36 
Manual dexterity tests, 325-326
Matching test, 87 
Mathematics, 241, 262-268 
Mean, 451-453 
Measurement, 3-7 
differs from evaluation, 3—4 
Mechanical aptitude tests, 326-330 
Median, 450—451 
Mode, 453 
Motor proficiency, 399 
Multiple-choice test, 85-87 
distractor, 86-87 
Music, 276-280 
Musical aptitude tests, 330-331


New-type tests, 79 
Norms, 42, 52-54, 69-72 
age, 53, 70-71
grade, 53, 70
percentile, 53, 71
standard score, 53, 71-72
Numerical rating scale, 36 


Objectives, 17-21
clarification of, 19-20 
formulation of, 17-19 
selection of measures for, 20 
Objective tests, 30-33, 79-100 
psychological characteristics,
30-31
technical features, 31-33 
Objectivity, 42, 51-52 
Observation, 33-34, 116-123, 343- 
344, 364-369, 396-398 
advantages, 123 
defined, 118 
effect of observer, 121-122 
in lower grades, 364—368 
in upper grades, 367—369 
interpretation, 122-123


length of period, 120-121 
limitations, 123
objectives evaluated by, 117 
place in evaluation, 117-118 
recording, 119-120 
Occupational aptitude tests, 332-337 
Oral examinations, 34, 110-115 
limitations, 110 
reliability and validity, 112 
values, 110-112 
Oral trade tests, 112-113
Organismic psychology, 10 


Paired comparisons, 167 
Percentile, 450 
Personal reports, 36-37, 172-198 
Personality evaluation, 172-198 

autobiography, 180 

drawing and painting, 190-192 

handwriting, 195 

play techniques, 192-193 

problem check lists, 178-180 

sentence completion, 193-194 

story telling, 194—195 

tests and inventories, 176-178 
Personal-social adjustment, 338—354 

autobiography, 347 

defined, 338-339

difficulties in evaluating, 339-340 

free association, 345-347

interview, 347-348 

observation, 343-344 

projective techniques, 345-347 

rating scales, 342-343 

sociometric techniques, 348-350 

situational tests, 350-352
Physical growth, 398-399 
Planning a test, 94-98 

objectives, 94-95 

outlining, 95-97 

sampling, 96 

summarizing responses, 98 
Problem-solving, 377-392 
Product scale, 36, 259-261 
Projective techniques, 12, 36-37, 172-198, 345-347
Psychodrama, 351 




Quartile deviation, 453—454 
Quartiles, 451

Questionnaire, 35, 136-148
administration, 146-148 
construction, 137-146 
errors, 142-144 
format, 145-146 
purposes, 139-142 
use, 137-139 


Range, 453 
Rank-order rating scale, 165, 167 
Rating scales, 35-36, 156-158, 163-171, 342-343
construction, 167-170 
types, 163-167 
uses, 156-158 
Reading, 246-257 
literary discrimination, 256-257 
silent and oral, 248-252
work-study skills, 253-255
Reading readiness, 246-248
Readability, 252-253
Reliability, 42, 46-51 
Reliability coefficients, 47-50 
coefficient of equivalence, 49-50 
coefficient of internal consistency, 
48-49 
coefficient of stability, 50 
Rorschach test, 37, 182-186 


Sampling, 22 
School and teaching practices, 424— 
446 


Sciences, natural, 273-276 
Self-descriptive inventories, 341-342 
Self-evaluation, 400, 422, 432—433 
Sensorimotor, 396, 399 
Short answer tests, 79-100 

cost, 81 

defined, 79 

guessing, 81 

instructional uses, 80 

limitations, 81-82 

planning, 94-98 

sampling, 79, 96 

scoring, 80 

testing complex processes, 82 

types of items, 82-88 


values, 79-80 
Situational tests, 350-352 
Social adjustment, 199, 202-204 
Social class, 413-421 
Social studies, 270-273 
Sociodrama, 351, 352, 369-370 
Socio-economic status, 412-423
effect on pupil achievement, 412 
effect on pupil attitudes, 413 
effect on school organization, 414 
effect on vocational goals, 413 
methods of measuring, 414-420 
Sociogram, 210-212 
Sociometric methods, 37, 199-214,
348-350
construction and administration, 
206-212 
nomination technique, 200 
rating scale, 200-201 
Who's Who, 201-202 
uses, 202-206 
Sociometry, 12, 37, 199-200
Speaking and listening, 260-262 
Specific determiner, 84 
Speech, 261-262 
Stability, coefficient of, 50 
Standard deviation, 452, 454 
Standard error of measurement, 47 
Statistical concepts, 447—457 
Status, 412-423


Tabulation, 447—448 
Teacher rating scales, 438-446
Testing, development of, 4-6 
Test item types, 79-88 
completion, 82-83 
true-false, 83-85 
matching, 87-88 
multiple choice, 85-87 
Test outline, 95-97 
Tests, 11, 30-34, 37, 79-98, 101-113,
176-178, 182-189, 242-262, 270-275,
280-288, 302-305, 316-332, 350-352,
370-372
achievement, 242-245
art aptitude, 330-331
attitude, 370-372


business education, 287—288 
clerical aptitude, 325 
completion, 82-83 
essay, 34, 101-110 
foreign language, 280-282 
handwriting, 259-261 
home economics, 286-287 
industrial arts, 282-285 
informal, 11, 242-243 
intelligence, 316-323 
interests, 302-305 
language arts, 246-262 
manual dexterity, 325-326 
matching, 87 
mathematics, 262-268 
mechanical aptitude, 326-330 
multiple choice, 85-87 
musical aptitude, 330—331 
natural sciences, 273-276 
new-type, 79 
objective, 30-33, 79-100 
occupational aptitude, 332-337 
oral, 34, 110-115 
oral trade, 112-113
personality, 176-178 
planning, 94-98 
reading, 246-257 
reading readiness, 246-248 
Rorschach, 37, 182-186 
science, 273-275
short answer, 79-100
situational, 350-352
social studies, 270-273 


thematic apperception, 37, 186- 
189 


trade knowledge, 283-285 
true-false, 83-85 
work-study skills, 253-255 
Tests, use in evaluation, 62-75 
administration of, 64-67 
construction of, 63 
interpretation of, 69—72 
scoring of, 67-69 
selection of, 63 
Thematic Apperception Test, 37, 
186-189 
Thinking, 377-392
errors, 379-380
evaluation techniques, 380-391 
relation to interests and needs, 378 
relation to values, 379 
teacher-made devices, 381-391 
Trade knowledge tests, 283-285 
True-false test, 83-85


Validity, 42-46, 242, 269 
concurrent, 44 
construct, 45 
content, 45, 269 
curricular, 242 
predictive, 44 

Values: see Attitudes 

Values and thinking, 379 

Variability, 453—454 


Work-study skills, 253-255 
Writing, 257-260 
analysis of products, 257—258 
handwriting, 259-261 
mechanics, 257 




